MEMORANDUM FOR THE SECRETARY OF DEFENSE

THE  UNDER  SECRETARY  OF  DEFENSE 

SUBJECT:  Final  report  of  the  Defense  Science  Board  Task  Force  on 
Improving  Test  and  Evaluation  Effectiveness 


Attached is the final report of the Defense Science
Board Task Force on The Contributions of Modeling and Simulation
(M/S) to Defense Test and Evaluation, chaired by BGEN Robert A.
Duffy, USAF (Ret.). This report highlights a number of
significant steps regarding the use of models and simulations
that can result in current and future improvements in test and
evaluation.

The Task Force determined that M/S can help provide more
illumination of choices in the operational requirement process,
increase flexibility in the development process and assist in the
preparation of early operational assessments. As an example,
by placing more emphasis on early and continual operational
evaluation during development, operational problems can be
identified early.

A general task force consensus is that a process
needs to be established that translates the operational
requirements into an evaluation framework. Models and
simulations are expected to play a key role in the development of
this framework. At present, the translation of requirements to
technical criteria and then into an evaluation framework is judged
to be ambiguous.

I  suggest  that  you  read  the  attached  letter  from  the 
Chairman,  the  Executive  Summary  and  recommendations,  and  approve 
the  report  for  publication. 


Attachments 
OFFICE OF THE SECRETARY OF DEFENSE
WASHINGTON, D.C. 20301-3140


DEFENSE SCIENCE BOARD

Mr.  Robert  R.  Everett 
Chairman 

Defense  Science  Board 
Dear  Mr.  Everett: 

Attached is the final report of the Defense Science Board
Task Force on Improving Test and Evaluation Effectiveness. The
Task Force identified a number of potential uses of models and
simulations to improve test and evaluation and the acquisition
process if employed early and effectively in the system life
cycle. The use of models and simulations can amplify and expand
our understanding of system and mission requirements, system
effectiveness, and costs resulting from acquisition decisions.

To achieve this potential a need exists for a process
that will provide an independent, objective evaluation of model
and simulation utilization. Credibility of models and subsequent
simulations is important and must be considered in light of
their application to a specific problem.

The Task Force has recommended several significant and broad
actions to improve test and evaluation and the acquisition
process:

o Support early involvement of the operational test
community in the development phase through a process that uses an
evaluation framework.

o Use simulations to help determine the events and criteria
that must be tested.

o Conduct excursion and sensitivity analyses to focus on
system engineering criteria that validate modeling results and
contribute to an early operational assessment.

o Support acceptance of simulation as an evaluation tool
by increasing development phase flexibility through a process
that allows re-evaluation as threat, technology and knowledge
evolve.

o Involve users early with mock-ups of man/machine
interfaces to enable a better understanding of the system design.




I want to thank all of the members of this panel for their
contributions to this report.


Sincerely, 


BGEN Robert A. Duffy, USAF (Ret.)
Chairman,  DSB  1989  Summer  Study  Task 
Force  on  Improving  Test  and 
Evaluation  Effectiveness 
EXECUTIVE SUMMARY


Over the last several years, models and simulations have been increasingly used to
support the development, test and evaluation process for new weapon systems. This
practice is expected to continue. Over the last ten years we have seen dynamic growth
in the computing and networking technology areas, which are the underpinning for
digital simulation. This trend is expected to continue, and will permit lower cost and
higher fidelity simulations. As a result, the Task Force on Improving Test and
Evaluation Effectiveness was requested to look at ways of improving the use of modeling
and simulation as tools in the test and acquisition of defense systems.

A number of significant steps were identified regarding the use of models and
simulations that can result in immediate and future improvements in test and evaluation.
Modeling and simulation can be an effective tool in the acquisition process throughout
the system life cycle, but most importantly if employed at the inception of the system's
existence. Modeling and simulation can contribute knowledge and understanding of
system and mission requirements, system effectiveness, and costs resulting from
acquisition decisions, but only if used properly.

The task force came to several conclusions with regard to the use and credibility of
modeling and simulation in the test and evaluation and acquisition processes:

o The foundation of the process, the operational requirement and its
translation into system terms, can be improved through the use of modeling and
simulation.

o An early and continual involvement of the OT&E community in the requirements
process can contribute to and improve the acquisition process.

o A development program, as embodied in a specification and contract, may become
overly rigid, restricting the willingness to evaluate and incorporate changes as
threat, technology and knowledge evolve. Modeling and simulation can be used
as a tool for continual evaluation of potential changes.

o In cases where modeling and simulation raises items of uncertainty in terms of
system requirements to achieve operational utility, unanticipated early
operational tests may be warranted even while a system is still under
development.

o Accounting for performance early in system acquisition improves system
capability and enhances the ability of the test and evaluation process to predict
operational performance.

o The availability of high quality, reusable models and simulations could decrease
redundant efforts while improving quality of key elements.

The credibility of a model cannot be considered separately from its application to a
specific problem, validity of data inputs, and qualifications of those executing the model
and interpreting the results. Current DAB documentation does not address model and
simulation credibility. Advanced technology in both hardware and software offers
opportunities for improving the credibility and applicability of models and simulations, but
continuing research is needed. In view of the limitations of models and simulations, the
approach for effective use of modeling and simulation in the operational requirements
area should be one of identifying areas where uncertainty levels require early
operational testing. Since operational testing is expensive, the isolation of areas which
require tests is important.
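
The isolation of areas that require tests can be pictured with a small, purely
illustrative calculation. The sketch below assumes a toy effectiveness model and
invented parameter ranges; none of the names or numbers come from the task force's
material. It varies one input at a time across its uncertainty range and ranks the
inputs by the resulting swing in predicted effectiveness. Inputs at the top of the
ranking would be the natural candidates for early operational testing.

```python
# Hypothetical illustration only: the model, parameter names, and ranges
# below are invented for this sketch and do not come from the report.

def effectiveness(p):
    """Toy system model: probability of mission success."""
    return p["detect"] * p["track"] * (1.0 - p["jam_loss"]) * p["kill"]

# (low, nominal, high) bounds express how uncertain each input is assumed to be.
ranges = {
    "detect":   (0.85, 0.90, 0.95),
    "track":    (0.90, 0.92, 0.94),
    "jam_loss": (0.05, 0.20, 0.50),
    "kill":     (0.70, 0.75, 0.80),
}

def one_at_a_time(model, ranges):
    """Swing each input across its range while holding the others nominal.

    Returns (name, output swing) pairs sorted from largest to smallest swing.
    """
    nominal = {name: bounds[1] for name, bounds in ranges.items()}
    swings = {}
    for name, (low, _, high) in ranges.items():
        low_case = dict(nominal, **{name: low})
        high_case = dict(nominal, **{name: high})
        swings[name] = abs(model(high_case) - model(low_case))
    return sorted(swings.items(), key=lambda item: item[1], reverse=True)

ranking = one_at_a_time(effectiveness, ranges)
for name, swing in ranking:
    print(f"{name:10s} output swing = {swing:.3f}")
```

In this made-up case the jamming-loss term dominates the ranking, so it would be the
first candidate for an early operational test, while inputs with small swings could
reasonably be left to the model.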

Early in the course of this study, the task force discovered that, in order to make
useful suggestions on modeling and simulation, corresponding recommendations in the
areas of test and acquisition would likewise be necessary. The task force has
recommended several significant and broad suggestions to improve test and evaluation and
the acquisition process:

o Emphasize the cooperative learning role for operational testing during
development, and support this activity through a process that uses an evaluation
framework established at the start of a program.

o Do not employ simulations to prove or disprove things, but exploit their
ability to isolate high sensitivity areas. Simulation has an important role in
providing sensitivity analyses, and as a method of focusing on system engineering
issues early through operational tests.

o Confidence in models can be enhanced by employing them for excursion and
sensitivity analyses, and focusing on critical issues by running tests and
validating the results. It is not feasible or cost effective to set up a central
office to accredit models, nor is it necessary to implement a management process
to distribute and reuse simulations.

o OSD must allow the development phase to become less rigid and support the
acceptance of simulation as an evaluation tool.

o The tools are available and the cost is sufficiently low such that every program
should build mock-ups of man/machine interfaces as early as possible, and
involve actual users to better understand the utility of the system design.

This study has determined that it is important to anticipate operational test issues,
both for reasons of cost and credibility. More emphasis on operational testing is needed
during system development so that operational problem areas can be identified while they
are still economically resolvable. A process must be established that defines evaluation
frameworks which predict probable evaluation procedures; then, as the real program
progresses, the frameworks should be upgraded consistent with the advancing state of
knowledge. Also, it must be acknowledged that the current acquisition process stifles
evaluation, and it is recommended that upper management levels provide direction to
develop more open attitudes regarding the responsibilities and contributions of test
and evaluation in the acquisition process.

Finally, the task force found no need to establish an independent agency or office
to accredit or manage the use or distribution of simulations. Modeling and simulation
can and should be used to focus testing into functional and operational areas
where there is a lack of assurance, and to recognize those areas where sensitivity is
sufficiently questionable that actual testing is in order. In this way confidence in the
final product is realized through testing married with simulation.
SECTION 1 - INTRODUCTION

Recent demands to reduce DoD spending and costs associated with systems
acquisition for both hardware and software development activities have prompted a look
at the acquisition life cycle. More specifically, the phases of development and
operational testing continue to reflect high visibility as critical points to assess system
credibility. The possibility that the addition of simulation to the process
might provide valuable insights was raised. As a result, the Defense Science Board was
tasked (Terms of Reference, Appendix A) to study how to improve the effectiveness of
test and evaluation with modeling and simulation (M/S) as the focus. The tasking
specifically requested the DSB to:

o Review prevailing use of models, laboratory tests and field tests.

o Determine appropriate situations in which to test and/or model.

o Determine the required fidelity of representation of the system under test and
its environment.

o Determine which discipline will govern the interpretation of results.

The Task Force membership, Appendix B, heard a variety of presentations, which
are listed in Appendix C.

MODELS AND SIMULATIONS

Models and simulations have been and continue to be used extensively to support
the weapon development, test and evaluation process. Such use can be expected to
continue, if not increase, in the future.

Defense systems are increasingly complex. Their operational utility depends
increasingly on successful performance at extended ranges. The integration of sensor
information from multiple parts of the electromagnetic spectrum or from multiple sources
is becoming a significant factor. The systems are required to operate in adverse
environments (weather, hostile electromagnetic, space, enemy countermeasures). To be
effective, they must interact with other systems, often over great distances. Advanced
command and control are required to overcome these difficulties. Furthermore,
systems now being developed are expected to remain effective in the future against
threats that will evolve in ways not fully predictable. A complete test of such systems
would be large in scope and require the generation of conditions that are difficult if
not impossible to create short of actual combat. The practicalities of cost, test range
space, availability of advanced threat systems/surrogates, safety, etc., will necessarily
limit test data availability. Forming an overall evaluation of a major system's
performance will almost always require a "model", even if only a mental one, to
integrate the available test data and to extrapolate to those conditions which cannot be
created in the test environment. These models and simulations are not replacements for
test data, but rather complementary tools in the evaluation process.

In the general sense, a model is a representation of an object, system or process
(or a part thereof) in mathematical, physical or logical terms, usually simplified, often
[illegible] or abstract, serving as a basis for calculations, predictions or further
investigations. Simulation is a technique for experimentation in which the operation and
dynamics of a real-world system are [illegible] or reproduced by some different system,
usually involving one or more models.^ Under these definitions, a scaled representation
of an aircraft is a model; placing that representation in a wind tunnel to study flight
dynamics would be a simulation.

The basic concept underlying these definitions is that a model is some abstraction
that embodies our understanding of a system while a simulation represents the dynamic
exercise of one or a set of models.

A large variety of simulations have been developed to support different aspects of
the development and evaluation process. Simulations differ as to the scope of the
"system" they attempt to reproduce. Simulations are available at the system/subsystem
level which typically are based on detailed models of the relevant physics or engineering
of the system. Combat simulations can be at the one-on-one level, few-on-few,
many-on-many, battalion, theater, etc. Generally, these simulations sacrifice detail in the
modeling of individual systems in return for an increase in scope.

A simulation can be only as good as the knowledge incorporated in the models it
exercises. Since the ability to model human performance and [illegible] is very
limited, the degree to which modeling of human factors is needed and the manner in
which this is attempted is an important characteristic of a military simulation. Physical
and engineering models may not require it. Many simulations are analytic in that human
factors are accounted for by some set of algorithms or decision rules which stay fixed
during the simulation run. Interactive simulations exist which allow differing degrees of
human involvement. Some require higher level decisions to be made by humans (e.g.,
tactics of a unit), while the modeling at the individual platform/weapon level does not.
Manned simulators replicate systems in greater detail and require a person to "operate"
the individual system. Simulators can incorporate the actual operating software of
subsystems. Manned simulators for both air and ground systems are being increasingly
interconnected. Hardware-in-the-loop simulation is an analogous technique wherein the
actual hardware is made to operate by a simulated stimulus.

The variety of simulation types and the number of individual simulations which have
been developed reflect, in part, the range of evaluation decisions they are expected to
support.


^These "definitions" have been synthesized from multiple sources including technical
and general purpose dictionaries and the definitions given in DoD Directive 5000A
SECTION 2 - FINDINGS


Acquisition Process


Effective systems development involves processes like planning, analysis, decision
coordination, requirements definition, funding, design, development, fabrication and test
and evaluation. This set of activities we term the "acquisition process".

The  acquisition  process  can  function  in  a  number  of  ways  depending  on  the  focus  of 
the  organization  tasked  with  the  responsibility  to  develop  the  military  system. 

Early in the acquisition process many of the activities focus on the development of
concepts, requirements, and preliminary designs that are not well defined or understood.

There are uncertainties and ambiguities that arise as the system concept is formulated
with the focus toward operational utility. This part of the process tends to be "loose" in
the context of an open set of sciences and technologies that are addressing the problem
of defining operational requirements.

As illustrated in Figure 2-1, the notion of the process being "loose" at the front
end of the system development is characterized by: 1) the lack of experience with formal
methods to define accurately system requirements; 2) a communication gap between the
operational and engineering communities in the translation of complex requirements; 3)
insufficient availability of analysis tools to assist in the formulation of requirements; and
4) the absence of the OT&E community as an observer to the front-end activities of the
program, which tends to cause surprises when viewed at the end. Transition through the
development phases causes the ambiguous capabilities that were derived as requirements
to become rigid and fixed in the form of technical specifications and contract language
which has been dictated by precedent and regulation.

Programs typically remain many years in the development process. The longer a
program remains in the development cycle the more likely that changes to mission
requirements will occur requiring reevaluation of the fundamental concepts and tradeoffs
that underlie the technical specifications. The threat that generated the need in the
first instance is a moving target.

The current process leaves little flexibility (i.e., as "rigid" baselined specifications)
for integrating changes to systems under development, when those changes are caused by
new technology or national priority or threat-motivated changes to mission requirements.
While baselining is a positive step to produce program stability for long lasting programs,
it can stifle needed changes. When changes are applied they add cost to contracts,
adversely affect schedules, and can cause contractor or government deviations from their
contractual commitments.

Recent changes to DoD standards and policies have streamlined the acquisition
reporting chain and may simplify the generation of [illegible] development requirements.
Even with the benefits of current changes to the acquisition process, however, the
potential for delay remains when changes must be introduced. Unless development
time can be reduced or flexibility can be provided in the acquisition process, the
potential for missed technology and delayed program changes will increase acquisition
costs and reduce our ability to respond to the evolving threat.

As the system transitions to the operational community for testing, the realization
that the operational utility of the system may be deficient comes too late for changes to
be rapidly and economically applied to correct the system.


IMPROVING  T&E  EFFECTIVENESS 


Figure  2-1 


Figure 2-2


Operational testing returns focus to the mission. The current state of the system
requirements and the ability of the developed system to attain operational utility against
that requirement are audited. With regard to the original mission requirements, the result
of system changes in need, understanding, and people in responsible positions creates new
measures of effectiveness at a very late stage. This activity frequently creates surprises
and may well require costly changes in the system during or after deployment.

Role of Test and Evaluation in Acquisition Process

As stated in DoD Directive 5000.3, "The primary purpose of all T&E is to make a
direct contribution to the timely development, production, and fielding of systems that
meet the users' requirements and are operationally effective and suitable." It is
generally agreed that this should be accomplished by a continuous assessment of the
system's capabilities (or potential capabilities) as it progresses through the process.

Defense  Test  and  Evaluation  is  organized  into  two  parts:  Development  T&E  and 
Operational  T&E.  DT&E  is  conducted  to  assist  in  the  engineering  design  of  the  systems 
as  well  as  to  verify  attainment  of  technical  performance  specifications,  objectives  and 
supportability  (as  identified  in  the  contract  between  the  Government  and  the  Contractor). 
OT&E  is  conducted  to  determine  the  operational  effectiveness  and  suitability  of  the 
system  for  use  in  combat  by  typical  military  users. 

Test and evaluation is a critical component of our existing acquisition process. The
test and evaluation world is split into two general communities of development and
operational test and evaluation. Each plays a distinct role in the acquisition process, as
pictured in Figure 2-2, but both exist to help field operational weapon systems that work
and are effective in combat conditions. DT&E is focussed on making the system work
and OT&E is focussed on how well it works.

DT&E

DoD 5000.3 states that DT&E is conducted throughout various phases of the
acquisition process to ensure the acquisition and fielding of an effective and supportable
system by assisting in the engineering design and development and verifying attainment
of technical performance specifications, objectives, and supportability. DT&E is an
integral part of the full-scale development process. They are constantly reviewing the
design, prototypes, and development test results against the functional and technical
specifications.

Development testing covers a wide range of components and conditions ranging from
material sample testing to full up system testing. The purpose of development testing
includes tests to evaluate design approaches, tests to collect data to validate analytical
models, tests to assess technical risk, tests to verify technical performance, tests to
demonstrate specification compliance, and tests to predict operational performance. All
these tests are focussed on ensuring that the development process yields a system that
complies with the specifications.

Development evaluation has historically focussed on evaluating the results of
development testing against the requirements outlined in the technical specifications.
Evaluation planning most often follows the test effort, thus the evaluation
methodology is most often driven by the test events already planned. The evaluator must
make do with the results available from the development tests.
OT&E


Title 10 U.S.C. Section 138 defines OT&E as "the field testing, under realistic
combat conditions, of any item of (or key component of) weapons, equipment, or
munitions for the purpose of determining the effectiveness and suitability of the weapons,
equipment, or munitions for use in combat by typical military users; and the evaluation
of the results of such tests."

OT&E is tasked with field testing weapon systems in realistic conditions to
determine effectiveness and suitability. OT&E is the key player in the operational
testing portion of the acquisition process. They perform the final examination of the
weapons system to determine its effectiveness against the Required Operational
Capabilities (ROC).

As prescribed by law, operational testing must test as much of the weapon system
as possible in conditions as near as are achievable to actual combat. This mandate is
difficult to achieve because of constraints due to cost, security, safety, test
instrumentation, terrain, treaty compliance, and many others. For example, one cannot
kill soldiers to determine the kill effectiveness of a new bullet or rifle design. The law
also requires operational testing to determine operational effectiveness and suitability.
Operational effectiveness as defined in DoD 5000.3-M-1 means: "The overall degree of
mission accomplishment of a system when used by representative personnel in the
environment planned or expected for operational employment of the system considering
organization, doctrine, tactics, survivability, vulnerability, and threat (including
countermeasures and nuclear threats)."

Operational suitability is defined as: "The degree to which a system can be
satisfactorily placed in field use, with consideration given to availability, compatibility,
transportability, reliability, wartime use rates, maintainability, safety, human factors,
manpower supportability, logistic supportability, documentation, and training
requirements."

The determination of operational effectiveness and suitability as defined above is
impossible based solely on field test results without further analysis. This further
analysis or evaluation must rely more and more on modeling and simulation as weapon
systems increase in complexity and in light of the various constraints placed on field
testing.


TEMP 

The Test and Evaluation Master Plan (TEMP) defines and integrates the DT&E and
OT&E efforts for all major weapon system procurements. It relates program schedule,
decision milestones, test management structure, and test resources to critical technical
characteristics, critical operational issues, evaluation criteria and procedures. It is used
as a tool for oversight, review, and approval of the test and evaluation effort by OSD
and all DoD components. The TEMP as described in DoD 5000.3-M-1 is brief by directive
and explicitly covers the system requirements, program summary, DT&E, OT&E, and test
and evaluation resources. The initial TEMP must be submitted to ODDDRE(T&E) prior to
Milestone I and updated at least annually thereafter. In summary, the TEMP is viewed
as a living document throughout the acquisition cycle, outlining the roles of DT&E and
OT&E.


Two major weaknesses were observed in reviewing the role of test and evaluation in
the process. The first observation was that there is a need to have the
OT&E community participate throughout the entire acquisition process, not just at the
end (as related to the "late" discussion earlier). The second observation was that the
test and evaluation efforts rely almost solely on test results which often do not arrive
until late in the development cycle when design changes are costly (as related to the
"rigid" discussion earlier). These observations can be addressed and their effect
mitigated by policy changes and increased use of modeling and simulation.

Participation 

It was observed that the OT&E community is not heavily involved in the acquisition
process until near the end of the program. As described earlier, the Task Force felt
that this is too late to be of benefit. A weapon system tends to meet its
specifications yet at times fails its operational tests. This is a fundamental weakness in
the acquisition process and is attributed to unforecasted and perhaps unforecastable
changes in the threat or environment. Early and continual involvement of the OT&E
community (or its function) could help mitigate the effect of these surprises, reduce
weapon system costs, and yield more effective weapon systems.

Test Emphasis

It was observed that the test and evaluation community places a heavy emphasis on
test and a light emphasis on evaluation. Test and evaluation are interrelated and
complementary processes, both of which are necessary; neither alone is sufficient.

Evaluation can be used to judge overall system performance against the operational
mission requirements and to reassess performance as the mission requirements and
system design evolve. This evaluation is supplemented by test results. Evolving analytical
models and simulations can help. A consistent and traceable set of evaluation tools could
be used throughout the acquisition cycle to help mitigate surprises encountered during
operational tests. The framework for the test and evaluation process could be
documented and updated in the TEMP. Modeling and simulation could play a major role
in improving the evaluation process.

The Role of M/S in the Acquisition Process

Modeling and simulation (M/S) is used extensively throughout the acquisition
process. M/S is at times used in establishing the mission requirements, in designing the
weapons system, and in forecasting the weapon system costs.


Use and Type of Simulations


Within the establishment, the body of simulation users within the Services
and among defense contractors is large. These include Defense Agencies, Service
laboratories and schools, research centers, analysis divisions, program offices (PMs/SPOs),
contractors and, more recently, the Service Operational Test Agencies (OTAs). Indeed,
the list of those organizations not using simulations is probably quite short.^


^If simulations to support exercises, training and wargaming were considered, most
of the field commands would be included, and the list of non-users would be vanishingly small.
Despite this myriad of users, simulation types/applications and the differences
between Service organization/procedures, some generally applicable observations can be
drawn of the acquisition process:

Deficiency Identification
Mission Requirement
Required Operational Characteristics
Functional/Technical Specification
Full Scale Development
Development Testing
Operational Testing
Production & Fielding
Upgrade


This process of identifying and developing mission requirements is, of
necessity, supported by limited and imprecise data. It is based largely on "theoretical"
data which includes estimates of technological advances and their military applicability as
well as estimates of threat advances and potential. "Historical" data - past combat
lessons and performance of current systems - are also available. While
physical/engineering simulations play an important role, translation of this data into a
mission level requirement demands that a higher level "model" be used. If simulations
are used, they have been operational (or higher) level simulations.^ Typically they have
also been analytic in nature. The same models also typically support the tradeoff
analysis in COEAs,^ which support the Milestone^ I decision at which time a "preferred"
system approach is selected and its required operational characteristics are first defined.
Particularly if Milestones 0 and I are not combined, prototype or demonstration/lab data
may also be available. The mission area analyses and COEAs that support the
requirements process are the responsibility of the Service staffs - i.e., the proponents of
the system.


^In some acquisition programs, documented use of simulation to identify
requirements is absent.

^Cost and Operational Effectiveness Analyses

^Defense Acquisition Board (DAB) program decision point approving to proceed with
next life-cycle phase, DoD Instruction 5000.2.

As the program moves through concept demonstration/validation and the program
office becomes established, simulation tends to drop down to the system or engagement
level (i.e., one-on-one). As technical performance data become available,
models are refined and then used as a development tool. These system or engagement
level models are often used to quantify the government's Request for Proposals (RFP).
Contractor proposals use system, engineering and cost models. A set of technical
specifications for full scale development results. After Milestone I, simulation use
resides increasingly in the program office and more significantly the contractor(s).

After the technical specifications have been embodied in a contract and FSD
progresses, the majority of simulation supports the attainment of specifications
rather than addressing the mission requirement directly. Uncertainty as to how well the
mission requirement is being met can develop if the FSD system is not fully meeting the
technical specifications. System and engineering level simulation dominates. Manned
simulators are used extensively, especially in the aircraft industry. Hardware-in-the-loop
simulations are also used, particularly in sensor/ECM programs. These simulators may
also be used in supporting development testing. Engagement level modeling is not that
uncommon, particularly if an engagement level requirement is in the contract (e.g.,
MLRS/TGW  requirement  of  defeating  X  percent  of  a  with  a  salvo). 

As the program progresses to the operational test phase, simulations continue to be
used. Since operational tests are conducted in the field by troops, live munitions cannot
be used. Therefore, simulators replicate firings (e.g., lasers and detectors) and
simulations provide "kill" results. These may be very detailed engineering
models, such as missile fly-out models or tables which have been generated by
vulnerability simulations off-line to the test. The fielded threat in operational tests also
consists of "simulators", either U.S. systems of varying fidelity to those of the threat or
specially designed manned simulators used as threat surrogates. Thus, a substantial
amount of field data is based on simulation.

The Service operational test agencies (OTAs) that have the responsibility of
conducting tests have traditionally not developed or used simulations in their evaluations.
In the past two years, however, this has been changing as the Director of Operational
Test and Evaluation has been urging that operational assessments be made early, well
before the test itself. Since this is a relatively new effort and the OTAs have limited
resources, prominent examples of simulation use in support of early operational
assessments are lacking.

While the above simplified characterization applies to the "typical" system, some
systems, at OSD direction, are also tested and evaluated in joint tests and operational
utility evaluations (OUEs). Some of these activities use simulations extensively. For
example, the AMRAAM OUE was conducted on netted manned simulators.

Credibility of Models and Simulations

As the use of simulations has increased in DoD, the simulations themselves have
grown in size and complexity. In fact, most simulations that appear in the acquisition
process are too complicated to be sufficiently understandable by decisionmakers in the
time available, leading to serious concerns of validity. It is becoming ever more
important to ensure that the results of simulations used to support major acquisition
decisions are reliable, and that they do not pose inordinate risks. Since simulations may
be the only practical evaluation tools available for certain acquisition decisions, steps
must be taken to assure a justifiable measure of confidence in the results provided by
such models and simulations.


An  essential  attribute  of  every  truly  useful  model  or  simulation  is  that  it  has  earned 
a  high  degree  of  credibility;  its  construction,  execution,  and  the  interpretation  of  results 
are considered to be "good and true", taken in the proper context. In several previous
studies  of  modeling  and  simulation,  the  credibility  of  a  simulation  has  been  identified  as 
the  key  determinant  of  utility. 

System and subsystem models at the engineering level, particularly those which
model functions that are exercised in testing, seem to enjoy a fairly high level of
credibility. For example, it is virtually impossible to imagine a modern aircraft
development program that would not make extensive use of wind tunnels and flight
dynamics simulations. It has been shown that these frequently permit a
reduction in the number of flight hours required during the development process. This
high level of simulation credibility can be attributed to the degree of understanding of
aerodynamics (at least empirically) and the use of instrumented flight test data to
continually improve the fidelity of such simulations.

Complex combat simulations which estimate operational performance at the force-on-
force level (some would also argue at the one-on-one engagement level) naturally
encounter a great deal more skepticism, since these high level models must necessarily
make simplifying assumptions and sacrifice detail. Contributing to this distrust is the
fact that the fundamental theoretical bases for the simplifications are less well
understood (Lanchester's equations hardly inspire the confidence of Maxwell's). The
importance of human performance factors is an additional complication. It could be
argued that only actual combat, with instrumentation to collect data, could fully resolve
all the suspect elements of simulation. Use of multiple models with different theoretical
approaches and assumptions may provide a hedge against the uncertainty of our
fundamental knowledge of combat processes. These different approaches and assumptions
must be made clear, however, else the different models are likely to generate more
confusion than insight.
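
As an illustration of how simplified such combat theory is, the sketch below integrates Lanchester's square law for aimed fire, in which each side's loss rate is proportional to the size of the opposing force. It is not drawn from the Task Force's material: the function names, attrition coefficients, and force sizes are hypothetical, and constant attrition coefficients are assumed throughout.

```python
import math


def lanchester_square(a0, b0, alpha, beta, dt=0.01, max_steps=10_000_000):
    """Forward-Euler integration of dA/dt = -beta*B, dB/dt = -alpha*A.

    alpha: hypothetical rate at which one unit of A defeats units of B.
    beta:  hypothetical rate at which one unit of B defeats units of A.
    Stops when either force is exhausted (or after max_steps).
    """
    a, b = float(a0), float(b0)
    for _ in range(max_steps):
        if a <= 0 or b <= 0:
            break
        # Update both sides simultaneously from the current strengths.
        a, b = a - beta * b * dt, b - alpha * a * dt
    return max(a, 0.0), max(b, 0.0)


def predicted_survivors(a0, b0, alpha, beta):
    """Closed form from the square-law invariant alpha*A^2 - beta*B^2."""
    margin = alpha * a0 ** 2 - beta * b0 ** 2
    if margin >= 0:
        return math.sqrt(margin / alpha), 0.0
    return 0.0, math.sqrt(-margin / beta)


if __name__ == "__main__":
    # Illustrative numbers only: 100 versus 80 units, equal effectiveness.
    print(lanchester_square(100, 80, 0.1, 0.1))
    print(predicted_survivors(100, 80, 0.1, 0.1))
```

The whole engagement reduces to two constants and the initial force sizes, with no terrain, command, or human performance factors. That is the kind of simplifying assumption that makes force-on-force results hard to validate.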

Although many attempts have been made to develop procedures for assessing the
credibility of a model/simulation, none have gained widespread acceptance. At the
present time, there is no policy or process in place in DoD to assess the credibility of
specific models and simulations to be used in the test and evaluation and the acquisition
process.


In general, the Task Force notes that, due to the variability of applications and
different issues to be addressed by simulation runs of similar models, it is unrealistic to
attempt to "accredit" any single model for more than one specific application scenario.
This is not to say, however, that a more rigorous evaluation process of model simulations
is not needed; it is. Further, because of the wide diversity of models and simulation
applications, it is unreasonable to expect that any single "accreditation" agency could be
effective in evaluating every model proposed by agencies in support of arguments
related to acquisition decisions. Indeed, the Task Force holds that the credibility of a
model or simulation has meaning only in the context of the model's application to a
specific problem, and reflects not only the integrity of the model formulation itself, but
the validity of the input data and the qualifications of those executing the model
and those interpreting the results. In short, the entire process that uses the results of
simulations in arriving at conclusions should be subjected to scrutiny; the approval of
any subset of procedural steps is insufficient to assure the credibility of results.
Most of the large simulation models used in DoD contain one or more of several
"building block" submodels or databases developed independently by executive agents, such
as the nuclear effects submodels formulated by DNA and threat submodels constructed by
DIA. The credibility of these special purpose, reusable simulation modules derives
primarily from confidence in the originating agency; measures should be taken to
support the currency, maintenance, and appropriate use of these submodules.

In contrast, the issue of credibility must be addressed anew for each unexplored
application based on the larger, more complicated analyses involving interactions
beyond well-understood physical laws. Many of the more complex models have earned
respect over time; however, caution is appropriate even for these cases since such models
are used by disparate groups who may not be intimately familiar with the original
constructs and assumptions of such models. The Task Force found no standard
process for ensuring credibility. As warranted by the import of the decision involved,
close scrutiny of the modeling/simulation process and its application to the question at
hand is needed, and may be best carried out by an independent panel of experts, selected
on a case-by-case basis.

Although model simulations are frequently cited in support of proposed major
acquisition programs, there is currently no prescribed reporting requirement that
adequately addresses the use of modeling and simulation
either prior to or during the development process. The Task Force believes this
simple omission detracts from the credibility of properly employed simulations. The
planned use of modeling and simulation should be reported in the milestone
documentation provided the DAB, at least in summary form, along with the methodologies
of application and interpretation, a description of applicable limitations,
assumptions, extrapolations, and sensitivities.

Recent advances in technology have accelerated and broadened the application of
modeling and simulation. Developments in the computer sciences do much to promote
the standardization of model simulation software structures, database
interfaces and languages. By embracing many of these new techniques and guidelines,
the proper operation of simulations may be more easily ensured. Further,
evaluation methods can be made more powerful and reassuring, leading to a more
accurate assessment of the credibility of models and simulations.

In general, the Task Force has identified four areas of concern regarding modeling
and simulation credibility:

a) Preserving the credibility of specialized, reusable model elements,

b) Evaluating the credibility of large-scale simulations used in acquisition,

c) Facilitating assessments of credibility by proper reporting of M/S methodology
in appropriate DAB documentation, and

d) Exploiting new technologies for improving the credibility of simulations.

Modeling and simulation can be an effective tool in the acquisition process, if used
throughout the system life cycle. Modeling and simulation contribute knowledge and
understanding of system and mission requirements, system effectiveness, and costs
resulting from acquisition decisions. The Task Force found that increased M/S effort in


the  following  broad  acquisition  areas  could  be  helpful  to  modeling  and  simulation 
concerns: 

Mission Requirement Generation

M/S is used to establish mission requirements because it is generally the only tool
available in the genesis of a weapon system acquisition (Figure 2-3). The use of M/S to
establish mission requirements varies from program to program and from Service to
Service. Two basic forces drive changes in mission requirements. The first is the push
of advancing technology that makes more effective weapons possible, and the second is
the pull of the evolving threats. The use of M/S to forecast the threat ten to fifteen
years in the future is very difficult and riddled with vagueness. Also, the use of M/S to
predict technological advances, although based on years of extensive research, is often
quite imprecise. On the other hand, rarely do we predict the appearance of the truly
revolutionary system (e.g., atom bomb).

To effectively set mission requirements it is necessary to model more than the
weapon system capability of interest as shown in Figure 2-3. It is often necessary to
model many-on-many systems, force-on-force engagements, and sometimes nation-
versus-nation engagements. Modeling greater groupings of forces results in increased
levels of aggregation with resulting losses in precision and certainty. Once the
requirements are set the models are often set by the wayside until the next technology
push or threat pull appears.

Proper M/S use can provide more illumination of choices in the operational
requirement process. In the beginning of the acquisition process operational requirements
are driven by advancing technology and evolving threat. M/S is the only tool available
to evaluate the full ambit of force structures, doctrines, weapon systems, and threats to
guide weapon system procurement. The requirements process is by nature inexact
because it is difficult to forecast the threat, technological advancement, doctrinal
changes, and costs ten to fifteen years in the future. A good requirements evaluation
must include excursion analysis in many dimensions to properly account for and bound
the uncertainties in the assumptions. It is also important that these models reflect
reality as much as possible. For example, to properly investigate doctrinal changes it
may be necessary to introduce people in the loop of the simulations. The M/S developed
in the conceptual stage of an acquisition should be viewed as a tool that will be further
developed and expanded throughout the development cycle.
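
The excursion analysis described above can be sketched as a sweep over combinations of assumption values that reports the range of outcomes. This is a minimal illustration only: the effectiveness function, parameter names, and values are hypothetical placeholders, not taken from any program or model discussed in this report.

```python
from itertools import product


def toy_effectiveness(threat_size, kill_prob, shots):
    """Hypothetical measure: expected fraction of the threat defeated (capped at 1)."""
    expected_kills = shots * kill_prob
    return min(1.0, expected_kills / threat_size)


def excursion_bounds(model, excursions):
    """Run the model over every combination of excursion values.

    excursions maps each assumption name to the list of values to try.
    Returns (lowest score, highest score, list of (params, score) cases).
    """
    names = list(excursions)
    cases = []
    for values in product(*(excursions[name] for name in names)):
        params = dict(zip(names, values))
        cases.append((params, model(**params)))
    scores = [score for _, score in cases]
    return min(scores), max(scores), cases


# Illustrative excursions around uncertain assumptions (all values invented).
excursions = {
    "threat_size": [40, 60, 80],   # uncertainty in the forecast threat
    "kill_prob": [0.3, 0.5],       # uncertainty in technology performance
    "shots": [100],                # held fixed in this sweep
}

low, high, cases = excursion_bounds(toy_effectiveness, excursions)

if __name__ == "__main__":
    for params, score in cases:
        print(params, round(score, 3))
    print("bounds:", low, high)
```

Reporting the low and high bounds, rather than a single point estimate, is one simple way to make explicit how strongly the assumptions drive the requirement.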

Development Design Tool

M/S is used most extensively as a development design tool. M/S design tools often
mature throughout the development process, often supplemented by development test
results. M/S is used early in the development process for initial performance evaluation
and first-order system sizing. Later in the development process M/S is used for detailed
system sizing and design verification. Near the end of a development program M/S is
used to supplement the development test program to analyze regions of the envelope that
are either impossible, costly, or unsafe to investigate through testing. M/S is used very
effectively as a design development tool by the developing contractors.

Proper use of M/S can provide more flexibility in the development part of the
acquisition process. The development portion of the acquisition process is often viewed
as too rigid because of strict design requirements embodied in the technical
specifications. The underlying goal of weapons system procurement is to acquire and
field effective and supportable systems. The M/S tools developed in the operational
requirements phase could be utilized to evaluate the relative applicability of the technical
specification throughout the development cycle, which is often five to ten years. This
would allow revision of a technical specification in the development phase when the
revision will yield a more effective or more supportable weapon system. This should not
be viewed as a license to continually revise the technical specification, but as a
mechanism to revise a technical specification that originally may have been overly rigid.
The M/S tools developed for operational requirement generation will need to be expanded
and refined to accomplish these goals.

Cost Forecasting

M/S is used throughout the acquisition cycle to forecast weapon system costs. The
credibility of these models has historically been suspect, as is evident in the press.

There are some basic reasons for the lack of credibility in cost prediction models. Cost
models are reliant on a vague set of assumptions in the beginning of the process. Early
in the acquisition cycle there is often no clear description of the weapon system and
definitely no design details. This is compounded by vast uncertainties in the value of
the dollar ten to fifteen years downstream. Most weapon system developments involve
some degree of technical risk that further complicates development cost predictions. All
these factors and more contribute to making cost forecasting a very difficult task,
analogous to forecasting the threat. Frequently, the Office of Management and Budget
dictates planning factors for cost estimates that will outweigh all other elements of
cost factors.

It is evident that M/S plays a large role in the acquisition process but the Task
Force believes that its use could be improved to further enhance weapon system
procurement. M/S could be more effectively applied to provide more illumination early in
the acquisition process, to provide more flexibility in the middle of the acquisition
process, and to provide more consistent utility evaluation throughout the entire
acquisition process.

Proper use of M/S can provide a more consistent utility evaluation
throughout the entire acquisition cycle. Often, the operational utility of an acquisition
program is evaluated twice in its acquisition cycle. The early evaluation is during the
mission requirement and required operational capabilities phase of the acquisition
process. The final evaluation is during the operational test phase of the acquisition
process. The length of time between the former and latter is normally between six and
ten years. A lot can change during that time span including evolving threats, new
technology, and changing economic factors that could change the utility of the weapon
system in development. A consistent evaluation (i.e., use of the same M/S tools and/or
tests) of the operational utility (i.e., effectiveness and supportability) throughout the
development cycle would enlighten key decision makers during the budgeting process.

This consistent evaluation process could be used to revise technical specifications when
appropriate (as stated above) or to curtail or cancel development programs that provide
little gain in operational utility. If M/S can shorten the acquisition process the enemy
is denied the opportunity to field countermeasures inside our production cycle.
SECTION 3 - CONCLUSIONS


Based on these findings, the Task Force came to the following conclusions:

a.  On  the  role  of  models  and  simulations  in  the  evaluation  process; 

1. The foundation of the acquisition process, the operational
requirement and its translation into system terms, can be improved.

2. Early and continual involvement of the OT&E community can
contribute to and improve the acquisition process. Modeling and
simulation can assist in this effort.

3. A development program, as embodied in a specification and contract,
may become overly rigid, restricting the willingness to evaluate and
incorporate changes as threat, technology and knowledge evolve.

Modeling and simulation can be used as a tool for continual
evaluation of potential changes.

4. Accounting for human performance early in systems acquisition improves
system capability and enhances the ability of the test and evaluation
process to predict operational performance.

b. On the credibility of models and simulations;

1. Simulations used by DoD have proliferated. Frequently they are
incomplete, too large or not thoroughly understood. Their credibility is
questioned.

Availability of high quality, reusable M/S elements could decrease
redundant efforts while improving quality of key elements.

2. A process does not exist to provide an independent, objective evaluation
of M/S utilization.

The credibility of a M/S cannot be considered separately from its
application to a specific problem, the validity of data inputs, and the
qualifications of those executing the model and interpreting the results.

3. Current DAB documentation does not address model and simulation
credibility.

4. Advancing technology in both hardware and software offers opportunities
for improving the credibility and applicability of M/S. Continued
research is needed.
SECTION 4 - RECOMMENDATIONS

Based on this evaluation, the following recommendations are made:

a.  For  the  role  of  models  and  simulations  in  the  evaluation  process; 

1.  The  Service  acquisition  executives  should  ensure  that  M/S  excursion 
analyses  are  applied  systematically  to  help  reach  and  maintain  agreement 
on  major  aspects  of  requirements  and  system  performance. 

2. The DOT&E and service OT&E communities should be chartered to
participate early in the requirements process. In particular, they should
translate operational requirements into an evaluation framework and
document the roles for M/S as well as for testing in meeting evaluation
objectives at each milestone and as appropriate between milestones.

3. The USD(A) should establish policy and provide guidance to the
acquisition community for systematically reevaluating system specifications
using M/S and test results.

4. Service acquisition executives should ensure that the development
programs employ man-in-the-loop simulation beginning with requirements
definition and mature the simulation along with the hardware throughout
the acquisition process.

JCS/CINCS should exploit technology capabilities in distributed computing
and networking to simulate coordinated combined arms engagements with
man-in-the-loop simulations and to evaluate results against live exercises.

b. For the credibility of models and simulations;

1. USD(A) should ensure refinement, maintenance and availability of models,
weapon and threat data descriptions, and simulation elements having wide
DoD utility.

Appropriate JCS and OSD offices should select/fund executive agents
to maintain element repositories (DNA-Nuclear models, DIA-Threat
data, JTCGs, etc.) complete with databases, code libraries, and
documented limitations.

2. USD(A) should charter DDR&E to enable, as necessary, independent panels
of experts to assess specific applications of M/S results on which
acquisition decisions are based. The work would be tasked on a case-by-
case basis and include participants from academia, industry, and the
government.

3. USD(A) should modify DODI 5000.2 to require that DAB documentation
(SCP/DCP, TEMP, COEA and CAIG) address the applicability of models
and simulations. For example, the documentation could describe the M/S
plan and methodology, limitations, assumptions, extrapolations,
sensitivities, results, analysis, and validation.
4. DDR&E should continue to fund M/S technology at both the fundamental
and application levels, including the M/S interfaces and languages,
executable specifications, model interoperability, validation techniques and
tools, and parallel and networked simulations.
THE UNDER SECRETARY OF DEFENSE

ACQUISITION

WASHINGTON, DC 20301

30 MAR 1989


MEMORANDUM  FOR  CHAIRMAN,  DEFENSE  SCIENCE  BOARD 

SUBJECT:  Terms  of  Reference  -  Defense  Science  Board  (DSB)  1989 
Summer  Study  Task  Force  on  Improving  Test  and 
Evaluation  Effectiveness 


You are requested to form a Summer Study Task Force to
examine the contributions of modeling and simulation to Defense
test and evaluation so as to improve the acquisition process.
Collateral tasks should examine ways to increase credibility,
realism and objectivity in the test and evaluation process.

This study should delineate the benefit to be derived from the
timely use and role of validated models and simulations in lieu
of threat targets and environments due to practicality,
security, international treaties, and/or civilian encroachment
considerations. The expected outcome should be test and
evaluation initiatives required to definitize the scope and
fidelity of testing required to: (1) support the quality
production of defense systems; (2) evaluate and reduce the
uncertainties associated with defense system
acquisition/production decisions; and (3) support, evaluate and
reduce the risks associated with introducing new technologies
into defense weapon systems acquisition/production decisions.

The perception persists that less than credible, realistic
and objective test results are obtained through the use of
models and simulations in lieu of threat targets and
environments. At the same time, system costs and real world
test restrictions have increased our dependence on models and
simulations. Task Force efforts should include:

-  Review prevailing use of models, laboratory tests and
field  tests. 

-  Determine  appropriate  situations  in  which  to  test  and/or 
model. 

-  What  fidelity  of  representation  of  the  system  under  test 
and  its  environment  is  required? 

-  What discipline governs interpretation of results and its
application?

This Task Force will be sponsored by the Office of the
Deputy Director Defense Research and Engineering for Test and
Evaluation (ODDDRE(T&E)). Brigadier General Robert A. Duffy,
USAF (Ret.) has agreed to serve as Task Force Chairman. The
Executive Secretary will be Colonel Matthew M. McGuire, USA,
ODDDRE(T&E/WSA), and the DSB Secretariat Representative will be
Commander George A. Mikolai, USN. It is not anticipated that
your inquiry will need to go into any "particular matters"
within the meaning of Section 208 of Title 18, United States
Code.


The Terms of Reference for this Task Force include no assignments
to the Task Force that would indicate the Task Force would be
participating personally and substantially in the conduct of any
specific procurement or place any member in the position of acting
as a "procurement official".

OGC:

Date:
APPENDIX C


BRIEFINGS PRESENTED TO THE DSB TASK FORCE
ON IMPROVING T&E EFFECTIVENESS

Meeting - April 11, 1989

Title                                                                 Presenter/Organization

Overview of Task Force Requirements                                   Kimmel/OUSDRE (DDTE)
Industry Perspective on Improving Test and Evaluation Effectiveness   Gansler/TASC
JCS Perspective on Test and Evaluation Modeling and Simulation        Roske
SDI Program Strategy                                                  Bleach/SDIO/OSD
DOT&E Modeling and Simulation Concerns                                Sanders/DOT&E

Meeting - June 6/7, 1989

Title                                         Presenter/Organization

Deep Fires Requirement & Threat               Reid/AMSAA
MLRS-TGW Program Overview                     Reed/HQDA
Accreditation Process for M&S                 Beavers/AMSAA
COEA Methodology Overview                     Jones/TRADOC
Countermeasures Modeling/Analysis             Palamo/LABCOM
Engineering Model Validation                  Bradas/MICOM
6-DOF Submunition Trajectory Model            Sanders/MICOM
Drop Tests & Comparison to Models             Sanders/MICOM
Hardware-in-the-Loop Simulation               Cole/MICOM
Captive Flight Tests & Data Analysis          Bradas/MICOM
Battlefield Environment Simulation            Alongi/MICOM
MLRS-TGW Effectiveness Modeling               McClung/PMO
Reliability Growth Modeling                   Foulkes/AMSAA
Test and Evaluation Master Plan               Foulkes/AMSAA
Software Development/IV&V                     Holeman/SDJNC
System Trainers                               Nally/Ft. Sill
Maintenance Trainers                          Blount/OMMCS
Vulnerability/Lethality Method                Kirk/BRL
Modeling in Live Fire Test Process            O'Bryon/DDDRE
BFVS Modeling/Testing Assessment              O'Bryon/DDDRE
AMSAA's Role in T&E                           Reid/AMSAA
Technical JEP/TDP                             King/AMSAA
AMSAA System Perf. Eval. Method               King/AMSAA
Supportability Analysis Method                Morton/AMSAA
OTEA's Role in T&E                            Dubin/OTEA
Modeling and Simulation in Support of OT&E    Dubin/OTEA
Meeting - July 11-13, 1989

Title                                         Presenter/Organization

Use of M&S in OT&E                            Dubin/OTEA
System Effectiveness Modeling                 Croteau/PA&E
OSD Cost Modeling                             Lee/PA&E
Use of M&S in Army estimates                  Young/CEAC
Fleet Ballistic Missile                       Fisher/FBMO
M&S concept paper presentation                Horowitz
M&S in Support of the RDT&E                   Keii
V-22 Operational Requirement                  Schaefer/USMC
V-22 Preliminary Design                       Martin/Bell
V-22 Risk Reduction/Design Devel.             Taylor
V-22 Design/Test Support                      Gaffey
Avionics/Flight Controls                      Ballauer
Facility/Manufacturing                        Hays
Threat and Vulnerability Analysis             Johnson
Reliability and Maintainability Predictions   Monje
Operational Availability Models               Venzlowsky
Cost Analysis Models                          Weathersbee
COEA M&S Support                              Kusek
Flight Training and Simulators                Curtis
USAF DSCS Program Overview                    Hanigan/DCA
Threat                                        Hartigan/DCA
Sys. Sim.-Integrated Test Facility            Groener/Army
M&S for Ground Segment [illegible]            Miletta/Army
Nuclear Environment & Threat Sim.             Wittwer/DNA
Link Effects Evaluation & Mitigation          Bogusch/DNA
Performance Assessment & Validation           Bogusch/DNA
DSCS End-to-End Performance Model             Siffls/DCA

APPENDIX D - GLOSSARY OF TERMS

AMRAAM - Advanced Medium Range Air-to-Air Missile
CAIG - Cost Analysis Improvement Group
COEA - Cost and Operational Effectiveness Analysis
DAB - Defense Acquisition Board
DIA - Defense Intelligence Agency
DNA - Defense Nuclear Agency
DOT&E - Director, Operational Test and Evaluation
DSB - Defense Science Board
DT&E - Development Test & Evaluation
ECM - Electronic Counter Measures
JCS/CINCS - Joint Chiefs of Staff/Commanders in Chief
FSD - Full Scale Development
MLRS/TGW - Multiple Launch Rocket System/Terminally Guided Weapon
M/S - Modeling and Simulation
ODDDRE(T&E) - Office of Deputy Director for Defense Research
and Engineering (Test and Evaluation)
OSD - Office of the Secretary of Defense
OTA - Operational Test Agency
OT&E - Operational Test & Evaluation
PM - Program Manager
RFP - Request for Proposal
ROC - Required Operational Capabilities
SCP/DCP - System Concept Paper/Decision Coordinating Paper
SPO - System Program Office
TEMP - Test and Evaluation Master Plan
USD(A) - Under Secretary of Defense for Acquisition
IMPROVING T&E EFFECTIVENESS


TERMS  OF  REFERENCE  MANDATE 


TO  EXAMINE  CONTRIBUTIONS  OF 
MODELING  &  SIMULATION 
TO 

DEFENSE  TEST  &  EVALUATION 
TO 

IMPROVE  THE  ACQUISITION 
PROCESS 


IMPROVING  T&E  EFFECTIVENESS 


CHAIRMAN 

BGen  Robert  A.  Duffy,  USAF  (Ret.) 

IMPROVING T&E EFFECTIVENESS


DDDRE(T&E): Dr. H. Steven Kimmel
DOT&E: Dr. Patricia Sanders
PAE: Dr. David Gallagher
DIA: Mr. Nick Bennett
DNA: VAdm. John T. Parker
Dr.  Leon  Wittwer 
EXECUTIVE  SECRETARY: 

Colonel  Matthew  M.  McGuire,  USA 


DCA:  Dr.  Robert  Lyons 
SDIO:  Dr.  Richard  Bleach 
ARMY:  Mr.  Arend  (Pete)  H.  Reid 
AIR  FORCE:  Mr.  Carroll  G.  Jones 

LtCol.  Robert  J.  Schwarz 

DSB  MILITARY  ASSISTANT: 

Colonel  Oliver  Westry,  USA 


Dr.  Gary  C.  Comfort 
Dr.  Robert  H.  Boling 
Dr.  Mark  C.  Zabek 


Mr.  Howard  J.  Harvey 
Mr.  Ramon  L.  Strauss 
Mr.  Edward  P.  Petkus 


Dr.  Ralph  Passarelli 




IMPROVING  T&E  EFFECTIVENESS 


Outline 

Acquisition  Process 

•  Role  of  T&E  in  the  Acquisition  Process 

•  Role  of  M/S  in  the  Evaluation  Process 

•  Credibility  of  M/S 
IMPROVING T&E EFFECTIVENESS


Terms Of Reference Mandate


To Examine Contributions of
Modeling and Simulation

To

Defense Test & Evaluation
To

Improve the Acquisition Process


[Slide 1] Our panel was asked to look at modeling and simulation as they pertain to test and
acquisition of defense systems. In the course of doing our study, we discovered that in
order to make useful suggestions on modeling and simulation we had to make
corresponding recommendations in the areas of test and acquisition. So we broadened the
scope of our activity to include all of these processes. That makes the subject of our study
pretty complex, so I am going to begin the briefing by providing an overview of the
thought process used for relating all of these processes. This should provide the context
for the more detailed discussion with regard to the study.

We recognized that over the last ten years there has been a continually increased emphasis
placed on the use of special government test organizations, independent of the acquisition
organizations, to define, conduct, and evaluate the results of operational tests on newly
developed systems. This has been done to improve the acquisition process by adding
confidence to the production decisions, those buy, no-buy decisions that weigh operational
test data heavily. A significant side effect of this emphasis on the use of operational testing
as an audit tool after development has been a corresponding shift in the outlook of the
development community on the overall role of operational testing. This change has been to
de-emphasize the use of operational testing as a learning tool during development. That is,
something the developer does while designing a system to better understand the
complicated interrelationship between the specifications for a system, the design for a
system, and the ultimate operational utility that is being sought. Why does this correlation
occur? Well, look at it through the eyes of a program manager who sees the world as...
his job is simply to develop a system. A user substantiates the utility of the system both
before and during its development, and an independent organization evaluates its utility
once it is developed. So, the program manager's role is simply to be a provider of the
system with no specific requirement to deal with the utility of the system.

So, the question arises, "What are the consequences of this de-emphasis on operational
testing from being a learning tool during development?" We will show a number of
examples to indicate that the consequences are negative, and they are serious enough that
we make a number of recommendations to increase the use of operational testing during
development as a learning tool.

What does this have to do with modeling and simulation? Let me defer that for a minute
and provide some background on modeling and simulation. A model is an idealistic
representation of a system. A simulation is a process that permits us to evaluate a system
by stimulating a model with assumptions, the inputs, and observing responses, the outputs.
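
As a minimal sketch of the distinction just drawn (an illustration, not something taken from the briefing), the Python below treats a model as an idealized function and a simulation as the act of stimulating that model with assumed inputs and observing the outputs. The engagement model, its parameter names, and all numbers are hypothetical.

```python
import random


def engagement_model(weapon_range_km, threat_standoff_km, hit_probability, rng):
    """Idealized model (hypothetical): a weapon can only engage a threat inside its range."""
    if threat_standoff_km > weapon_range_km:
        return False  # the threat stays outside reach, so no engagement is possible
    return rng.random() < hit_probability


def simulate(weapon_range_km, threat_standoff_km, hit_probability, trials=1000, seed=0):
    """Simulation: stimulate the model with assumed inputs and observe the outputs."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    kills = sum(
        engagement_model(weapon_range_km, threat_standoff_km, hit_probability, rng)
        for _ in range(trials)
    )
    return kills / trials  # observed fraction of successful engagements


if __name__ == "__main__":
    # Same model, two different assumptions about the threat standoff range.
    print(simulate(4.0, 3.0, 0.6))
    print(simulate(4.0, 6.0, 0.6))
```

Changing only the assumed threat standoff range changes the observed output, which is the sense in which a simulation result is no better than the assumptions fed into the model.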


IMPROVING  T&E  EFFECTIVENESS 


Use  Of  Simulation  In 
The  Acquisition  Process 


[Figure labels: Development and Test Phases; Operational Test Phase]


[Slide 2] Of course the Defense Department has been using simulation for years and years
and in many different ways, ranging from operations research type simulations which help
in the very early planning stages of a new product, to understand the utility, to simulations
used by designers and developers, such as microelectronics designers or radar designers,
to simulations to deal with human factors, such as cockpit simulators, simulators that
synthesize complicated environments that real systems will operate in, such as electronic
combat environments, and ultimately the war-gaming simulations which are usually the
basis for our operational tests. Over the last ten years we have seen the dynamic growth in
the computing and networking technology areas which are the underpinning for digital
simulation. And we have seen a corresponding increase in the use of simulation.

Simulations can be much more elaborate and much lower in cost than they were in
earlier times. It is not a very risky prediction to state that this will probably continue to be
true for the next ten years. Since there aren't many things that get cheaper and better with
time, a question arises as to how we can take advantage of this evolution of technology to
permit simulation to have more value. Now back to my discussion on operational testing
during development for its learning value. It is here that we are going to focus our attention
on simulation.

First, we recognize that operational testing during development is expensive, and no one is
going to do it without clear value. So the question then is, "How do we know what tests
are worth running?" It is that role that we are going to assign to simulation; namely,
conducting simulations to help us identify and focus on areas of concern related to the
ultimate utility in operations of a product that is being developed. When the concern is
sufficient we should be willing to run operational tests targeted at the issue raised by
simulation. How you do this will be discussed later in the briefing.

Another question arises about simulation. We have all used simulations and had
experiences with simulations which gave us wrong results. Somehow or other, no matter
how many good simulations we have worked with, everyone always remembers the ones
that gave poor or misleading results. So there is a natural hesitancy to believe simulations.
This is a good thing, but the question arises as to if we are going to make more use of
simulations, what do we do about this credibility factor, especially when related to the
decision process in the Defense Department? While I talked earlier about production
decisions and operational testing, there are many other very high-value decisions made
prior to production decisions that don't have operational tests as their basis, but rely more
on simulation. So the question arises as to how does the decision-maker know whether
he's making a decision based on one of these bad simulations or a good one? This
question has been asked in many forms to our panel: "Should a central organization be
created in the government to accredit simulations that are permitted to be used in decision
making processes similar to what has been done with operational testing - bring in an
independent team for greater confidence?" "Should the government set up a central
management organization to not only accredit simulations as being good ones for use in
important decisions, but to manage the distribution and reuse of these standard simulations
across a wider segment than might otherwise use them?" Over the course of the briefing I
will bring in this issue of credibility. With that, I am going to get into the details of our
study.


IMPROVING  T&E  EFFECTIVENESS 


Acquisition  Process 


[Slide 3] I am going to begin with a discussion of the acquisition process, the first of those
three processes that we are going to try to interrelate. We have a very simple model here of
an acquisition process. It begins on the far left and has three parts: the beginning, the
middle, and the end. These are what we usually think of as the requirements, the
development, and operational test phase for production decision making. If we look at the
beginning, that is the period where people in the services generally are desirous of
improving their capability to perform a given mission and are imagining how some new
system or new technology can be exploited to make that possible. To do that, they have to
somehow or other extrapolate their past experience in military operations with their future
vision of systems and technology to bring about some vision of how a new capability can
make them more capable. This involves making assumptions such as what the enemy
threat will be, what the doctrine will be for our own and for enemy forces, what the
procedures for using this new capability will be, what the deployment of the new system
will be, etc. So a number of predictions have to be made that go along with the
extrapolation of the utility. During this period people get a stronger or weaker feeling about
the need for a system and sometimes we use those operations research type simulations I
referred to earlier to give some substantiation to the level of utility before proceeding with a
new development. In the sense that it requires lots of prediction and extrapolation, the
process, by its nature, is termed loose. It is not that people don't do a good job or that they
are not doing their best; it is just that by its very nature this process has to be loose,
involving predictions about our own forces as well as enemy forces. When people get a
strong enough feeling to go on with development they then bridge into that middle phase,
the development phase. The method of bridging is the creation of a technical specification
for the product to be developed. That technical specification is created by two groups: an
operational group, who had that vision in the beginning, and the group that is going to be
responsible for development, which is a more technically-oriented group. Through a
process of further extrapolation, that is, extrapolations about lower levels of detail on the
technology and lower levels of detail about what might be the coupling between the
specification and the operational utility, a specification is built. By its very nature it is an
erroneous process. Again, not errors due to people making blunders, but the specification
generation process involves even greater levels of prediction and extrapolation than the
loose process upon which it was founded to start with. Once that specification is created, it
becomes the basis for a contract, a contract that must be rigorously managed. So we shift
into a mode from loose things and error-prone things to something that has to be managed
rigorously. There is nothing wrong with that; it's a necessity of contracts. If one looks at
the usual length of the development process as this chart shows, it lasts many, many years.
The program manager who does a good job is one who keeps his programs stable through
rigorous management.

We call this phase "rigid" in the sense that we go from a foundation which is loose, to a
management process which is very rigorous and which is founded on keeping things
stable. We then bridge into the final phase which is the operational testing.
Here an independent group is brought in to determine whether or not this system really has
enough utility that it is worth producing. There are really three evaluations going on in this
phase. One is the evaluation of the equipment itself; the second is the evaluation of the very
early work done by the planners who imagined what the utility would be if such a system
were built. The third is the evaluation of the process that translated the vision of those
operational planners into a specification that was the basis for building the system, which
itself is subject to error. So we are testing three things, two of which could have been
tested ten years ago if we had the equipment. So, in the sense that two of the three things
have had a very long period of delay without much additional work, these tests are very
late. They are also late in the sense that the cost of discovering a problem in this stage of
acquisition is at its highest. Either the cost of the development is sunk, if a program is
cancelled, or, if a decision is made to rehabilitate the system, the cost of rehabilitation is at
its highest because in addition to doing the redesign work for the product, all of the support
costs associated with changing drawings, changing support equipment, etc., in keeping
with the modified design must be borne. When we, in fact, find a failure in operational
testing, we will refer to that as a surprise because it is hard to imagine that somebody
would go on developing a system for ten years in anticipation of a failure at the end. As
everyone knows in the Defense Department, surprises are very bad, because in addition to
the surprise of the details of that particular development, the credibility of the whole
acquisition process is brought to the surface. And when the process itself has a lack of
credibility, it has a side effect on everything we do that is negative and inefficient. So it is
really very important, from our panel's point of view, to avoid surprises both for the
instantaneous cost on the system in question, and the credibility loss to the process as a
whole.
IMPROVING T&E EFFECTIVENESS


Two  Study  Issues 


• What can be done to help avoid surprises
during independent operational testing?


[Slide 4] So that raises the question, "Can we do anything to avoid surprises?" There are
two presumptions here. One is that we have them and two, that hopefully they are
avoidable. So let us talk about whether we have them or not.
IMPROVING T&E EFFECTIVENESS


Typical  Surprises 


• "Change in Assumptions" Surprise


[Slide 5] The first kind of surprise I am going to discuss is what we call "the change in
assumption" surprise. This is where those early planners made some assumptions which
led them to believe that the system or technology they wanted to advance would be useful,
like the threat, the deployment of the system if it were going to be built, the environment it
would operate in if it were used, and on and on. Then ten years of development go by and
for whatever reasons the threat changes, the deployment plans change, the key features
change, some of the basic assumptions change. Then the operational test is run and it is
determined that those changes are so crucial that the utility of the product goes from being
something that was imagined to be useful, to something that doesn't seem like it is worth it.
A good example of this is the DIVAD. The DIVAD was an Army air defense system. It
was originally conceived in the early 70s and development started in 1977. Its purpose
was to protect the moving army from close air support attacks by fixed-wing aircraft and
from standoff helicopter attack. At the time the program was initiated, the standoff
helicopter threat was seen as a three kilometer range standoff weapon. So the designers of
the DIVAD decided it needed a four-kilometer firing range. Well the development went on
through 1985, and during that period of development two things happened to that threat.
One is the helicopter threat became primary, and second, the range of that firing capability
of the helicopter increased to six kilometers. Well, the operational test was run and it was
determined that the firing range of the DIVAD was inadequate given the
range of the helicopter threat. The result was that the program was cancelled after $1.2
billion of investment. Now it is not as if the development community did not know that the
threat was changing. They did, but they had lots of rationale as to why it was still logical
to develop the DIVAD with its four kilometer firing range. The point I want to make is not
that we should have produced the DIVAD, but why did we have to wait until the end of
the program and having spent $1.2 billion to decide that the change in assumptions was
crucial. Is there something we can do to make this sort of thing occur earlier?
IMPROVING T&E EFFECTIVENESS


Typical Surprises


• "Change in Assumptions" Surprise

• "Measure of Effectiveness" Surprise


[Slide 6] The second type of surprise I want to talk about is what we call the measures of
effectiveness surprise. This is where the early planners had some way that they thought
about utility. If the system could do it well, that would be useful. Then we go through
the development process and an independent team comes in to run the test and they come in
with a different set of measures. Different enough that what might seem like a very useful
system no longer appears to be useful. The result being that we do not produce the system.
An easy generic case for thinking about this is when one person says, "If the system under
evaluation results in a capability that is a lot better than is in the field today, and I see no
other way of getting it in the near future, that is good enough for me," whereas another
person says, "No, no, no, I draw a chalk line and unless you're better than that, the system
is not useful at all." A good example of that is the Aquila, another Army development.
The Aquila was an unmanned airborne vehicle and its purpose was to carry sensors to
provide a capability to take advantage of the extended firing range of artillery that the Army
already owns. These weapons can fire to 15 or 20 kilometers beyond the FLOT; weapons
such as the 155 Howitzer, and the MLRS rocket system. Yet they have no way of
knowing what targets are at that distance unless they send out airborne spotters or
ground-based spotters; not a very effective way for seeing what is out there. So the idea is to
provide an unmanned airborne vehicle with a TV sensor, IR sensor, maybe even a laser
designation system to aid those weapons. The original planners in 1974 said that, "...the
system was useful if the sensors could see half of the targets out in their area of vision, and
when targets were observed, 85 percent of the time the weapons could exploit it and
actually kill the targets. If the vehicle was not that hard to use, (e.g., it would take less
than an hour to get out on station and do a usable job), that set of capabilities would really
be useful to the Army." Well this program went on from 1974 to 1987 and in 1987 an
operational test was run. It only included the TV sensor. It did not include the full
capability. In the conduct of the test there were some confusion factors, like people didn't
know how to operate unmanned vehicles very well. They had little experience in doing
that, but nonetheless the test was run and the system generally satisfied all of those original
measures of effectiveness. Nonetheless it was determined that those were not good enough
to warrant production and the system was cancelled. That was an $812 million
development. Now in looking at the report there is no indication as to what was good
enough. It was only stated that the system was not good enough. So that raises the
question, "Should we go 13 years through a development with the designers and planners
having a vision of what was good enough and then arrive at a decision point where it was
decided that this was not good enough?" Why couldn't we have had substantiated
measures that the community as a whole had adopted as the ones? I will also point out that
many senior military people still believe that we should produce the Aquila and that, in fact,
if you look at it there is nothing coming down the pike to give that extended range to the
capable weapons.
IMPROVING T&E EFFECTIVENESS


Typical Surprises


• "Change in Assumptions" Surprise

• "Measure of Effectiveness" Surprise

• "Lack of Maturity" Surprise


[Slide 7] The third type of surprise I want to talk about is what we call the "lack of
maturity surprise." This is where a very natural tension in development always occurs.
You get the system to the end of that middle stage of development and it's time to decide to
start the operational testing. You have your very first units of systems available for test.
They've had no time to mature. Certain parts of a design in a system need to have real use
experience before they mature, such as software bugs and hardware reliability, and so you
have a decision to make. Should I take the time to get that experience, add that maturity,
and therefore have a better chance of running an operational test with success, at the
expense of leaving a factory idle: a factory that has a lot of people and a lot of machines
ready to produce? On the other hand, should I enter the test, perhaps
prematurely in the sense of maturity, but hopefully be able to get through an operational
test, get a decision to produce, get that factory working as quickly as possible, and do the
maturing during the lead time it takes to get the first production units out? Almost
invariably the Defense Department takes the latter course of a riskier entry into operational
tests to gain the economics of rapid production. This is probably a good thing to do;
however, on occasion, you run a test where those maturity things bite you. A good
example of that occurred with JTIDS, a tri-service airborne data link system that was being
evaluated. In this case, the objective was to have a system with 400 hours mean time
between failures via ground test as a capability. When JTIDS entered the operational test it
had about ten percent of that. As a result, the operational testers noted that the reliability
didn't seem very good and, in fact, it disrupted the ability to run good operational tests. So
they rightfully stated that the reliability appeared inadequate. Now this was no surprise to
the development community. They knew that these units had not had the chance to mature,
but it was a great surprise to all of the Tri-Service people who are presented with the results
of an operational test, and who have no idea about the status of maturity when the test
starts. This raises all kinds of issues like, does the organization developing the system
know what it is doing? Is the system any good? Is the management group in the
government incompetent? The subject turns from a technical one to a test of the
credibility of the whole acquisition system that created JTIDS. As I stated earlier, that
really does not do anybody any good and it is also very inefficient. Why is it inefficient?
Red teams, special panels, briefings to everybody who has any affiliation with the use of
that system and investment to make, start at a very rapid pace and use up a very long period
of time before credibility is regained. In fact, on the JTIDS program, what happened is
over the period of a year or so, while all that was going on the real system was being
matured and eventually showed via tests, about 80 percent of the mature reliability, and in
fact, is now back in a more normal mode of development. But at the expense of a very
long period of credibility loss; credibility that will probably never be fully regained in that
system. Perhaps we should share the knowledge of the state of maturity before we enter
operational test with the full set of involved players, rather than have a very large set of
people be surprised by the end result of the demonstration, that the system is not mature.
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Qfli  IMPROVING  TErE  EFFECTIVENESS 


Typical  Surprises 


•  “Change  In  Assumptions”  Surprise 

•  “Measure  of  Effectiveness”  Surprise 

•  “Lack  of  Maturity”  Surprise 

•  “Lack  of  Usability”  Surprise 


[Slide 8] The last example I want to give I call the "lack of usability surprise." This is
where, for whatever reasons, the user, the real user, the soldier or pilot or sailor who is
going to use the system, has had no chance to try the system until the operational test. That
is, the development community had surrogates trying it during development and so the very
first trial of its use is during operational test. The surprise is when, in fact, the real users
do not like it. They say the system might be good and all that, but we cannot use it. It is
too hard to use. We just do not like it. We do not want it. This really can happen. A
good example of this is a digital network built for the Strategic Air Command, called
SACDIN. This was a network whose purpose was to disseminate emergency action
messages to the strategic force structure and to receive status back from that force. As you
can imagine, this has to be a very secure network. The system that was being developed,
SACDIN, utilized message entry terminals comparable to your workstation of today that
might exist in your office for a computer system, except these workstations had to assure
security and there were all kinds of software measures to assure security. Well, the
system was developed over the normal course of time and the operational test was run.

One of the features built into this system (and you have to understand this was the most
secure system imagined for that time, nobody had ever gone this far in building security, so
the designers for security were very nervous) was that if a terminal had a lot of wrong
inputs put in, in sequence, it was possible that this was a security breach - someone was
trying in some way to disrupt the system - and they froze the terminal, audited the most
recent data and blew a whistle to bring in a security officer. That was how the system was
designed. Now you get to the operational test and you have a user using it who does not
have a lot of experience with that particular system and makes some errors, and he makes
enough of them that, in fact, it satisfies the criteria for being a potential security breach and
it freezes. At the end of the test the individual user says, "I cannot use that stuff, every
time I make a mistake, instead of helping me it freezes, it is not useful. I do not want it."
Well, you go and fix that. You change some software; that usually handles matters like this
and it is fixed. Well, to achieve the computer security, one had to have a specification that
was mathematically verified as maintaining security, manual validation that the
real software matched the specification, and then employ professional teams whose job it is
to try to do code breaking, to see if they can break the system or not. All that has to be
done through a regression test process to accept the new changes in the software that
satisfy that man-machine problem. In that case we spent about a year redoing SACDIN to
solve the problem. In our panel's view, today this should never happen. You should
never have a case where man-machine interaction has not been evaluated by use of rapid
prototypes and simulation.
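
The lockout rule described above can be sketched as a counter of consecutive invalid inputs. This is only an illustration of the design trade-off, not the actual SACDIN logic: the class name, the method names, and the threshold of three errors are hypothetical, since the briefing gives no such details.

```python
class Terminal:
    """Toy terminal that freezes after too many consecutive bad inputs."""

    def __init__(self, max_consecutive_errors=3):
        # Hypothetical threshold; the briefing does not state the real one.
        self.max_consecutive_errors = max_consecutive_errors
        self.consecutive_errors = 0
        self.frozen = False

    def submit(self, input_is_valid):
        """Process one input and return 'ok', 'error', or 'frozen'."""
        if self.frozen:
            return "frozen"
        if input_is_valid:
            # A valid input resets the run of errors.
            self.consecutive_errors = 0
            return "ok"
        self.consecutive_errors += 1
        if self.consecutive_errors >= self.max_consecutive_errors:
            # Treated as a possible security breach: lock the terminal.
            self.frozen = True
            return "frozen"
        return "error"


def run_session(inputs, max_consecutive_errors=3):
    """Feed a sequence of valid/invalid flags to a fresh terminal."""
    terminal = Terminal(max_consecutive_errors)
    return [terminal.submit(flag) for flag in inputs]
```

A rapid prototype of even this simple rule, tried by an inexperienced operator, would show how quickly ordinary typing mistakes reach the freeze condition.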
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BM  IMPROVING  T&E  EFFECTIVENESS 


Two  Study  Issues  (Concluded) 


• What can be done to help avoid surprises
during independent operational testing?

• How can we increase the value of modeling
and simulation in the acquisition process,
and be confident of its use?


[Slide 9] Now I am going to move away from tests and get into the area of simulation.
Then later I will bring it all together with our panel's recommendations. So let me talk
about simulations. I am going to begin by discussing that topic of confidence in simulation
that I referred to earlier, and then I will get into the use of simulation to help us focus on
operational testing during development as a second subject. So let us talk about
confidence.

DSB IMPROVING T&E EFFECTIVENESS


Comparison Of Simulation Results
Among Six OTH Radar Organizations

[Slide 10] This first set of examples is to give an indication of how hard it is to assure
confidence in simulations. The first example is associated with an over-the-horizon radar
activity. Over-the-horizon radars are high frequency radars, which transmit high
frequency electromagnetic waves that bounce off the ionosphere and illuminate targets well
beyond line-of-sight at very long range. Then these signals are reflected back at the radar
from targets via comparable return paths. These were developed to see 100 to 200 square
meter size aircraft targets such as bombers that might be attacking the United States.
During the period of time in which these early systems were developed, questions started to
arise about cruise missiles because the Soviet threat was changing to having standoff cruise
missile capability, which had smaller cross sections. The question was asked, "What sort
of a modification could be made to these radars we already invested in for bomber sized
targets to make them capable of seeing cruise missiles?" Even further questions were asked
like, "Do I have to modify them at all?" Because it turns out that if you know about HF
propagation, you understand that the performance of the radars is very situation dependent.
It depends on the path of propagation, it depends on the time of day, it depends on the time
of year, it depends on where in the sunspot cycle the period is that you are operating in, all
because of the effects of the ionosphere. So a lot of margin is put into the design of
over-the-horizon radars to handle off normal situations. So it was not ridiculous to say, "Well
maybe in the more nominal propagation conditions you could actually see very small things
even though the radar was designed to see bigger things." Well, a bunch of experts, and
they are listed on this chart, who have long-term experience with over-the-horizon radar
were going around giving different answers to this question. So some very wise person in
the government said, "Well, you know, it could be that they are all thinking about different
assumptions when they are answering the question, and HF propagation is so assumption
dependent that what we have to do is get them on an even ground on the assumptions." So
the government set up a special process where all of these people who had simulation
models that they had used regularly in the course of dealing with their bomber sized target
analysis, and reliably so, were asked to look at identical circumstances and give answers to
what size targets could these radars see under a given circumstance? Each of those black
bars represents, for this one case that is shown on this chart, an answer that one of these
organizations gave. And as you can see, they vary from one organization saying, "80
square meters is the size you can see," to others saying you could see targets in the ones of
square meters, which is more in the range of what we are talking about when talking about
cruise missiles.

So how could it be that all these experts with validated simulation models
gave such widely different answers? Well, obviously the extrapolations of the models for
this new question were unreliable, at least for a bunch of them if not all of them. So what
do you do? Well, two things were done in reality. First, it was decided to run a real test,
and drones were flown which were to be facsimiles of the real target to see what the radars
we had were capable of doing. You would need to run lots and lots of experiments to get
all the cases, but at least the tests would provide an ability to calibrate the simulation
models. Now the second thing, you could look at the details of the models, which was
done, and it turns out that they lacked fidelity relative to accounting for all the propagation
phenomenology, such as ionospheric focusing and multipath effects. Phenomenology
effects which were not critical in dealing with 100 to 200 square meter targets had become
very important in dealing with smaller targets. So we have a case here of valid models
extrapolated for use to problems that seem like they're the same problem, but are different
enough that the extrapolation is erroneous. The point being, that if I want confidence in a
simulation model I not only have to know about the model, but I have to know about the
extrapolation involved in dealing with the specific problem. So you have to know two
things very well.
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Marketplace  Validated  Models  Example 


Response Time Models for
Data Processing System

Electronics Reliability Models

SLAM II

RPP

Oracle

[Slide 11] The next example I want to deal with is what we call "marketplace validated
models." These are models which companies create and sell, and other companies use all the
time. The government uses them all the time also. And the marketplace is the test of
validation in that people buy them and use them. Again, I am going to illustrate problems
with validation with valid models. The first example I want to give is these response time
models for data processing systems. Here the question is, "I am building a big distributed
data processing system, maybe a worldwide system with a lot of users on it, and I would
like to know from the time a user pushes a button requesting data or asking for some
function to be done by the system, what is the time it takes to get a response?" It is a very
normal question to ask. It turns out that one of the key factors in determining that answer
is what we call "contention"; that is, when other users happen to make requests that ask the
same processors to function at the same time as this first user who I asked the question
about. If there is a lot of contention and the details of how that is technically managed are
inefficient, responses could take a very long time. And if there is not a lot of contention,
and the technical management is very efficient, responses might not take a
long time. Well, an input to the simulation models then is a prediction about contention.
Well, how does some software designer sitting in a factory designing software know what
the contention will be in some system that has not been fielded yet? He doesn't. So he
makes some judgment as to what he thinks the contention will be. People with experience
in this field know that we very frequently make misjudgments on this, and as a result we
get very wrong predictions out of these kinds of models. So this is a case of what we call
"garbage-in-garbage-out". Now what do you do about that? Well, experienced people
know you try to get earlier versions of a system out into the field and get some early
measurement of what this contention might really be based on field experiments. In that
way, they are able to provide some validation to the data and avoid garbage-in.
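
To see why a misjudged contention input matters so much, consider the textbook M/M/1 single-server queue. It is used here only as a stand-in, not as one of the commercial models the briefing refers to, and the function names and rates are illustrative assumptions. In that model the mean response time is 1 / (service rate - arrival rate), so the prediction climbs steeply as the assumed load approaches capacity.

```python
def mm1_response_time(service_rate, arrival_rate):
    """Mean time in system (waiting plus service) for an M/M/1 queue."""
    if service_rate <= 0 or arrival_rate < 0:
        raise ValueError("service_rate must be > 0 and arrival_rate >= 0")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival_rate >= service_rate")
    return 1.0 / (service_rate - arrival_rate)


def response_time_by_utilization(service_rate, utilizations):
    """Predicted response time for each assumed utilization (load / capacity)."""
    return {
        u: mm1_response_time(service_rate, u * service_rate)
        for u in utilizations
    }
```

Under these assumptions, a designer who guesses 50 percent utilization for a server handling 10 requests per second predicts 0.2 seconds, while a fielded system running at 90 percent would see 1.0 second. The model is the same in both cases; only the contention input differs.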


The other example has to do with reliability models. The problem here is one we all have
to deal with. Namely, we have the design of a piece of electronics sitting on a desk and we
are curious about what its reliability will be before we go and develop it. These models
will tell you that, and they will account for things like the quality of the parts, the thermal
stresses, and the electrical stresses that the system will have in operation, and through some
integrated set of calculations will determine what the mean time to failure will be. People
use these all the time, but people with a lot of experience know that this only accounts for a
set of factors that deal with parts. It does not deal with factors like workmanship,
manufacturing quality, or issues like the mechanical rigidity of the boards used, whether in
fact they will buckle, and whether the connectors that you have chosen are adequate to deal
with that sort of thing. So it really gives you an accurate answer for a part of the problem, but
it is a part of the problem that determines only a fraction of the reliability. So people who know
about this are wary about the complete answer, but nonetheless see the value of the partial
answer. That is, they are assured that the parts will not take them down in terms of
reliability. Somebody who is validating the model cannot really know what each factory's
quality is and what the workmanship standards are in each factory. It gets to be a very
specific determination. So that is another way that the ability to validate a model really does
not validate the result when presented for decision making.
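
The gap between a parts-only prediction and the fielded result can be illustrated with a minimal sketch. It assumes constant, independent failure rates combined in series, which is a common simplification and not necessarily what the commercial reliability models do; the function names and the numbers are hypothetical.

```python
def series_mtbf_hours(failure_rates_per_hour):
    """MTBF of a series system with constant, independent failure rates."""
    total_rate = sum(failure_rates_per_hour)
    if total_rate <= 0:
        raise ValueError("total failure rate must be positive")
    return 1.0 / total_rate


def mtbf_with_unmodeled_factors(parts_rates_per_hour, unmodeled_rate_per_hour):
    """Same calculation with one extra rate for factors the parts model omits,
    such as workmanship or connector problems (value is an assumption)."""
    rates = list(parts_rates_per_hour) + [unmodeled_rate_per_hour]
    return series_mtbf_hours(rates)
```

With 100 parts at one failure per million hours each, the parts-only figure is about 10,000 hours. If workmanship and mechanical effects contributed an equal failure rate, the figure would drop to about 5,000 hours, even though the parts calculation itself was accurate.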

So the point of those three examples was to illustrate that this concept of validating
simulations is really a very tricky business. It is difficult to imagine how one group could
do that.
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Joint  STARS  Example  Of 
Simulation And Test Performance Verification

[Slide 12] Now I want to switch to the subject of operational testing and the role of
simulation coupled to that. I am going to use an example which we believe is an
outstanding example of how simulations should be used, but we feel it is not common
enough in defense acquisition systems. This has to do with Joint STARS. Joint STARS is
an airborne radar system that is capable of seeing slow-moving vehicles like tanks on the
battlefield and serves as a surveillance resource for many weapons. Let us look at this
chart. It is complicated, but I think it's worth the effort. The graph on the upper left is an
operational simulation. What this simulation does is deal with the utility of the
surveillance system. And the way it does this is as follows: first we recognize that the
ability to sense moving targets in Joint STARS depends on sensing the velocity of the
target directly at the radar beam, the radial portion of velocity. So if a target is moving at
any speed, but orthogonal to the radar, it is not seen at all. If it is moving directly at the
radar beam, it can be seen depending on what minimum velocity that radar is sensitive to.
So the question is, "What should be the minimum velocity that radar is sensitive to in order
to have utility... ten miles an hour, five miles an hour, a mile an hour, what should that
be?" Well, the curve on the upper left that dealt with that subject is derived as follows: a
model was created which took the roads of Europe and the off-road areas that were usable
by tanks and trucks. The modelers laid a hypothetical Soviet force of moving vehicles
down on this area, and they then calculated by geometry the fraction of the total speed of
each of those vehicles that would be pointed at the radar's beam. They then computed, if
the radar could only see some minimum velocity or greater, what fraction of that target set
would be visible? And what the curve shows is that the specification turns out to be ____
kilometers per hour. Then if we could see ____ kilometers per hour, we would see about
75 percent or 70 percent of all the slow-moving targets like tanks, and we would see 95
percent of the fast-moving targets. And if we could not see ____ kilometers per hour but
could only see greater velocities, the fall-off rate is pretty fast in terms of the percentage of
the tanks that this system could see. So from this curve, people became very desirous of
having that radar sensitive to very slow speeds. You can understand that. Well, a
corresponding simulation was done which is shown on the upper right, and that was a
simulation done by radar designers. They did a design of what sensitivity this radar would
have to have in order to see targets of some minimum speed and faster. They did
simulations which ended up with a design curve of the sort shown here, which showed that
if we take the specification value, a given level of sensitivity would be needed to see ____
kilometers per hour. To see slower speeds, you would have to move down a very steep
curve of sensitivity improvements to see even small increments of lesser velocity. They
knew they were pushing the technology as hard as they could even to get to the
specification value on the curve. And that, in fact, is what ended up determining where the
specification should lie. Now they could have stopped there and in a briefing to decision-makers
said, "Well, the spec is ____ kilometers per hour, we have done some simulation,
and we meet the spec." But they did not do that. They also looked at the top part of the
curve above the star and they recognized that if their simulations were inaccurate, by not so
large an amount, they could drift into a portion of the design curve where they would be
losing a large amount of sensitivity to slow-moving targets. So it becomes important,
given that a drift in that direction due to a simulation error could yield a significant drift
downward on the other curve of utility. So it is a case where you have two slippery
slopes. And when you have two slippery slopes, it is a good rule to do added work to
make sure that you do not slip.
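
The geometry behind the utility curve can be sketched in a few lines. The model described above used mapped roads and a laid-down force; the sketch below replaces that with a much cruder assumption, namely that vehicle headings are uniformly random relative to the radar line of sight. The function names are hypothetical. A vehicle moving at speed v with heading angle theta has radial speed v * |cos(theta)|, so the fraction of headings with radial speed at or above a minimum v_min is (2 / pi) * arccos(v_min / v).

```python
import math


def fraction_visible(speed, min_detectable_speed):
    """Fraction of uniformly random headings whose radial speed is at least
    min_detectable_speed, for a vehicle moving at the given speed."""
    if speed <= 0:
        return 0.0
    if min_detectable_speed <= 0:
        return 1.0
    ratio = min_detectable_speed / speed
    if ratio >= 1.0:
        return 0.0
    return (2.0 / math.pi) * math.acos(ratio)


def fleet_fraction_visible(speeds, min_detectable_speed):
    """Average visible fraction over a set of vehicle speeds."""
    if not speeds:
        raise ValueError("speeds must not be empty")
    total = sum(fraction_visible(s, min_detectable_speed) for s in speeds)
    return total / len(speeds)
```

Even in this simplified form, the visible fraction falls off sharply as the minimum detectable speed approaches the speed of the vehicles, which is consistent with the steep fall-off the briefing describes for slow-moving targets.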

Given the focus simulation put on a potential operational problem, an aggressive program
to do early operational evaluation was established, to determine whether these simulations
were accurate or not and so that they would not wait 10 years to discover whether or not
this product has utility. The first part was to take the budget of input errors that creates that
design curve: antenna design, antenna stability, oscillator stability, the algorithms for
signal processing, lots of things; and understand the input that creates that output, and be
sure that they were not doing garbage in-garbage out. Since you cannot have the system
the first day of the program you can at least start with the inputs, and in fact after doing the
simulation work they have been doing antenna range testing, algorithm testing, vibration
testing, all as precursors to understand that at least they had the right inputs to their model.
They scheduled to run experiments at the earliest time possible, with subsets of the whole
system, but as full a capability for the radar as possible. They have scheduled an
operational test in Europe where they can determine that those design curves are, in fact,
accurate; accurate enough that they are not going to slip down that utility curve. At the
same time they are going to start to data link the data to the operational users so that they
can start to get early indications that the system will have utility at the end point. It is this
concept of simulation that we think should be stressed, namely, as a focusing mechanism
for running expensive, but very useful operational tests, as early on during the
development process as we can, rather than waiting those six to ten years for the
independent test to be the first crack at understanding whether the systems we are building
have utility or not.

[Slide 13] Now I am going to discuss the recommendations of the panel, and it is a very
well known panel with a set of experience that covers all of these topics. I will
highlight some instructional things that were said by individuals of the panel since I thought
they were useful. Norm Augustine made the point that we should think of simulation not
as something that confirms or rejects a hypothesis, but as something that is a mind
extender; it makes us think about an area of concern that we would not have focused our
attention on otherwise. As a result simulation leads us into an area of evaluation that may
turn out to be crucial that we might not have otherwise dealt with. Phil Shutter made the
point, in commenting on the notion of centrally managing the reuse and distribution of
simulation, that it ignores a very important point: that the designers of the simulation
themselves carry all the knowledge about what went into it, what can be extrapolated, and what
cannot. Transferring the models without the designers would be an error because many of
these codes are so large that it would be impossible for another organization to understand all
that went in. Jack Vessy, when looking at our recommendations, decided to quote George
Orwell who said, "there are times when the first duty of responsible people is to restate the
obvious" and that is how we view these recommendations, as a statement of the obvious.


DSB IMPROVING T&E EFFECTIVENESS


An Evaluation Framework


• Consists of:

- Measures of effectiveness

- Environment and threat definitions

- System assumptions

- Role of testing

- Planned M/S for evaluation

- Establishing M/S credibility


Recommendations

[Slide 14] I need one chart prior to presenting recommendations to provide a definition of
what we call an evaluation framework. Imagine setting up for each program, at the start, a
set of measures of effectiveness, assumptions (like those threats and those environments),
and how one plans to use testing and simulation as augmentations to each other over the life
of this program. Now there is no doubt that every one of these will be wrong the first time
you do it, but also imagine that you were willing, in addition to setting this up, to continue
to refine it as you gain knowledge through the development planning process. We will call
that set of information an evaluation framework.

Findings:

• Need greater emphasis on operational evaluation during the
development process, including the OT&E community

• M/S can assist in defining an evaluation framework for a program

• Involve the DOT&E/OT&E community to:

- Participate early in the requirements process

- Translate the operational requirements to an operational
evaluation framework

- Document an evaluation framework that identifies the roles
of M/S and testing for evaluation objectives at each milestone


CAUTION: The development community must take the lead in
these efforts since they will implement the framework. The OT&E
community must maintain their independence through the

[Slide 15] Our first recommendation says that, as I stated right at the beginning, we think
that we really need to emphasize the learning role for operational testing during
development, and that we would like to see this occur and be mechanized through the
setting up of these evaluation frameworks at the start of the programs. This will also
provide assurance that we will not run into the surprise of assumptions as we did in the
DIVAD example, or changes in the measures of effectiveness as we did in the Aquila
example. To do that we have to involve the independent operational test people right from
the start. We cannot let them just come in at the end. This raises two cautions. The first
caution is that they will not remain independent if they are involved in the programs earlier
than they are now. Our view is that confusing aloofness and independence is a big error.
The value of the independence is that they have knowledge to provide and a management
chain that gives them the independence to provide it. To keep them aloof is to lose
something, not to gain something. We recommend that this confusion not undermine this
recommendation.

The second caution is how can these operational testers lay out this evaluation framework;
they are a small group and they do not have the wherewithal or knowledge to do that. We
are not recommending that; we are saying that the developing community should take the
lead in laying out an evaluation framework, but the community of operational people
should be involved in agreeing to and continuing to modify the framework as the
development goes on.
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Finding: 

• The foundation of the acquisition process, the
operational requirement and its translation into
system terms, can be improved

• M/S can help by providing sensitivity analyses
which help to isolate requirement areas
needing concentrated evaluation


Recommendation:

• Service acquisition executives ensure that M/S
excursion analyses are applied systematically to
help reach and maintain agreement on major
aspects of requirements and system performance,
and to focus, not replace, testing


[Slide 16] The second recommendation we made has to do with the idea that we do not
want simulation to prove or disprove things, but we want it to isolate high sensitivity areas:
sensitivities that could make our view of utility wrong, if we are on the wrong side of that
sensitive curve. We want to establish an important role for simulation in doing those
sensitivity analyses, and this being the method of focusing us on those early operational
tests. So that has to go on right at the start of programs, and we think that the DoD has to
set up a system that asks for this. We do not want to see fixed point simulation
results; we want to see excursion analyses which will then be the basis for deciding
whether or not early operational testing should be done; focus, not replace testing.
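
The difference between a fixed point result and an excursion analysis can be shown with a minimal sketch. It sweeps one input across a range, evaluates a model at each value, and reports where the output changes fastest. The function names are hypothetical, and the model passed in is whatever simulation is being examined; nothing here is taken from a specific DoD tool.

```python
def excursion_analysis(model, values):
    """Evaluate model at each input value and compute the slope between
    neighboring points. The values must be strictly increasing."""
    outputs = [model(v) for v in values]
    slopes = []
    for i in range(len(values) - 1):
        step = values[i + 1] - values[i]
        slopes.append((outputs[i + 1] - outputs[i]) / step)
    return outputs, slopes


def steepest_interval(values, slopes):
    """Return the (low, high) input interval where the output changes fastest."""
    index = max(range(len(slopes)), key=lambda k: abs(slopes[k]))
    return values[index], values[index + 1]
```

The interval returned by `steepest_interval` is a candidate for early operational testing, since a small error in the simulation there would move the predicted result the most.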
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FindinQs: 

• A development program, as embodied in a specification and
contract, may become overly rigid, restricting the willingness
to evaluate and incorporate changes as threat, technology
and knowledge evolve

• M/S is a tool for continual evaluation of potential changes


• The USD(A) establish policy and provide guidance to the
acquisition community for systematically reevaluating
system specifications using M/S and test results


[Slide 17] The third recommendation has to do with the development period which earlier
on we called rigid. Now simulation is an evaluation tool as I stated earlier, and if we have
a management process that attempts to keep contracts and specifications fixed then there is
no room for evaluation, because the purpose of evaluation is to determine whether those are
correct or not. We believe that the current state of the acquisition process is such that we
stifle evaluation. Program managers are not motivated to find reasons that their programs
are not correct; they are interested in stabilizing things. We have a culture that does not
want to do evaluation; we can have all the evaluation tools in the world and we will not do
evaluation. So this recommendation states that we think that OSD has to do something to
change that culture. Now that is a very hard recommendation; we do not have a specific
one switch you turn that fixes this, but nonetheless we think that is crucial if we are going
to get any value out of the tools and the capabilities of simulation. This raises another
caution. A set of people will say that we will end up in a mode where we are always
changing everything and we will end up with nothing. We will lose management control.
In response to that we want to make it clear that this recommendation is not saying we
should give up configuration management on system developments, and that every time
someone does an evaluation that we should change something. There should be two
distinct processes, one that is evaluating and one that is changing, and we would expect the
rate of evaluation to be much higher than the rate of change. However, if we do not do any
evaluation, there will be no changing, and then we will end up with those surprises that we
talked about earlier.

Findings:

• Accounting for human performance early in system acquisition
will improve system [illegible] and enhance the fidelity of M/S [illegible]
and evaluation programs to predict operational performance


• Service acquisition executives ensure that the development
programs employ man-in-the-loop [illegible]
requirements definition and evaluation
[illegible]


• [illegible] technology capabilities [illegible]
computing and networking to simulate coordinated combined
arms engagements with man-in-the-loop and evaluate results


[Slide 18] The fourth recommendation has to do with the human factors problem illustrated
by SAODIH. This just simply points out that the tools are now available and the cost is
sufficiently low that every program should have an activity to build mock-ups of man
machine interfaces as soon as possible, and bring in the users to better understand the
utility of that design. That should be upgraded as the fidelity of the development proceeds.
Now there is one set of systems where this is particularly difficult to do, and that has to do
with big command and control systems where the users are generals or admirals and there
may be many of them. It is not easy to bring them in to try things out, and so here we are
pointing to a set of technology that is emerging for doing what is called distributed
simulation; that is, putting right in somebody's office his part to play in a global simulation
and having a common war game underneath that simulation that multiple players located in
various places could participate in concurrently. Perhaps through that mechanism we can
get higher level people in the defense department to play in those portions of the systems
that they in fact will ultimately be the user for. We are seeing that technology used today
for training, but we think that it can start to move into the area of development as well.
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Credibility  Of  M/S 


[Slide 19] Finally I want to make some recommendations with regard to this issue of
credibility.


Questions  Raised  On  Credibility 


•  How do we know whether or not to trust a simulation's output?

ANSWER: Excursion analyses, confirming test data,
documentation on results

•  Would we gain confidence by setting up a central
accreditation process for M/S?

ANSWER:  No 

•  Would  we  gain  efficiency  by  setting  up  a  central 
process  to  manage  the  use  of  M/S? 

ANSWER:  No 


[Slide 20] I would like to do that by answering the questions we were asked first. How do
we know whether to trust the simulation, should we set up this central office to accredit
simulations, and should we set up a management process to distribute and reuse
simulations? Our answers are as follows: on trusting simulation, we should not be
looking for single point answers. We should be doing these excursion analyses, these
sensitivity analyses. When we find something that makes us nervous, we should run a test,
and that is the validation: not the simulation, a test. We should have professionally
documented simulation results. Often all we see is the one chart that says we did a simulation
and this is the answer, instead of seeing that whole set of data that makes someone
understand how this simulation was calibrated and to what degree the extrapolation of old
data into new questions was made, so that we have an ability to bring in a third party who could
evaluate whether things seem credible or not. Should we set up this central office? The
answer from the earlier discussion is a clear "no"; there is no office we think that could do
it; it is not feasible. Should we set up a distribution process for redistributing these
simulations? Well, if we cannot accredit them we certainly cannot have a very logical
process for redistributing them, so we say no to that too. But that's not to say that we are
negative about credibility. We do have some recommendations.
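
The excursion (sensitivity) analysis described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from the briefing: the toy model, its input names, the 10 percent excursion size, and the flagging threshold are all hypothetical. The idea is to vary one input at a time about a baseline, measure how strongly the output responds, and nominate the most sensitive inputs as candidates for a confirming test.

```python
from typing import Callable, Dict

def excursion_analysis(
    model: Callable[[Dict[str, float]], float],
    baseline: Dict[str, float],
    step: float = 0.10,       # assumed excursion size: +/-10% of each input
    threshold: float = 0.4,   # assumed cutoff for "makes us nervous"
) -> Dict[str, float]:
    """Vary one input at a time and return the inputs whose normalized
    sensitivity (relative output change per relative input change)
    exceeds the threshold. Baseline inputs and output must be nonzero."""
    base_out = model(baseline)
    flagged = {}
    for name, value in baseline.items():
        high = dict(baseline)
        high[name] = value * (1 + step)
        low = dict(baseline)
        low[name] = value * (1 - step)
        rel_change = (model(high) - model(low)) / base_out
        sensitivity = abs(rel_change) / (2 * step)
        if sensitivity > threshold:
            flagged[name] = sensitivity
    return flagged

# Hypothetical stand-in for a real simulation: a made-up performance score.
def toy_model(p: Dict[str, float]) -> float:
    return p["radar_power"] ** 0.25 * p["antenna_gain"] ** 0.5 / (1 + 0.01 * p["clutter"])

baseline = {"radar_power": 100.0, "antenna_gain": 30.0, "clutter": 5.0}
flagged = excursion_analysis(toy_model, baseline)
print(sorted(flagged))  # inputs that would be candidates for a confirming test
```

In this sketch the simulation is not treated as validated by the analysis; it only points to where a test would be most informative.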
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Credibility  Recommendations 


•  Support refinement, maintenance and availability of
models which are reused by funding executive agents
through JCS and OSD (e.g., DNA-Nuclear, DIA-Threat)

•  Selectively use independent panels of experts to help
evaluate M/S results, when confidence is unsure and
decision is key

•  DAB documentation should include

-  M/S plan and methodology

-  M/S limits, assumptions, extrapolations,
sensitivities, inputs and outputs

-  Results analysis and test validation

•  Use more than one model for the same analysis so
comparisons can be made


[Slide 21] There are certain models in the defense department that tend to be used and
reused a lot by an expert group, such as the Defense Nuclear Agency (DNA) models on
nuclear effects or the DIA threat models. Now interestingly enough in the DNA case these
are never validated, and in the DIA case they often lead to a lot of problems, but
nonetheless, they are the best we have. Everybody uses them and those types of models
should have a budget line to reinforce their improvement and keep them current. We
recommend that the JCS and OSD in fact do that, but direct the money directly to those
groups that in fact both develop and use the simulations.


If a subject comes up where simulation seems to be a crucial part of it, we think that there is
a validation that can be done via an independent panel. You cannot validate the simulation
itself, but you can validate a whole bunch of things like: the people who designed it, do
they seem to know what they are talking about; the extrapolation, [illegible] the original
use of the simulation [illegible]
versus the extrapolation you are doing now. The input data: [illegible] and
how do you know you are not [illegible] simulation doing a partial evaluation [illegible]
[illegible]
needed. That should only be done in very special cases. On the professional

documentation point I made earlier, today's DAB documentation has a place for, but does
not call for, the data that determines what is the basis for validating simulation, and the
credibility factors, and so we are saying that those things in fact should become part of that
documentation. Finally, as that over-the-horizon radar example pointed out, you can
find strange results occur by comparing two different validated models on the
same extrapolated problem, and so this is at least a method of finding out whether things
seem to be reasonably stable or not. That is what we have to say about credibility.
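
The point about running two models on the same extrapolated problem can be sketched in the same hedged way. The two models below are invented for illustration: they agree closely inside an assumed calibrated range and drift apart outside it. The 15 percent tolerance is likewise an assumption, not a figure from the briefing.

```python
from typing import Callable, List, Tuple

def compare_models(
    model_a: Callable[[float], float],
    model_b: Callable[[float], float],
    cases: List[float],
    tolerance: float = 0.15,  # assumed acceptable relative disagreement
) -> List[Tuple[float, float, float, float]]:
    """Run both models on every case and return the cases where their
    relative disagreement exceeds the tolerance."""
    disagreements = []
    for x in cases:
        a, b = model_a(x), model_b(x)
        scale = max(abs(a), abs(b))
        rel_diff = abs(a - b) / scale if scale else 0.0
        if rel_diff > tolerance:
            disagreements.append((x, a, b, rel_diff))
    return disagreements

# Two hypothetical models, both plausible near the calibration data (small x).
def model_a(x: float) -> float:
    return 2.0 * x

def model_b(x: float) -> float:
    return 2.0 * x + 0.05 * x ** 2

cases = [1.0, 2.0, 5.0, 10.0, 20.0]  # the larger values stand for the extrapolated region
disagreements = compare_models(model_a, model_b, cases)
for x, a, b, rel in disagreements:
    print(f"case {x}: model A = {a:.1f}, model B = {b:.1f}, relative difference = {rel:.2f}")
```

Disagreement does not say which model is wrong; it only marks the region where neither result should be trusted without further evidence, such as a test.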
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Summary 


•  Avoid  Operational  Test  and  Evaluation  Surprises  by: 

-  Doing more testing and evaluation during
development, for operational learning value

-  Setting  up  an  Evaluation  Framework  for  M/S 
and  test  at  the  start  of  programs.  Involving 
the  independent  operational  testers. 

Reset  the  framework  over  time. 

•  Today's acquisition process stifles evaluation during
development. Need a more open attitude, and
management  processes  to  encourage  operational 
evaluations  during  development  that  may  result  in 

it  would  not  add  confidence. 

-  Simulation focuses testing through sensitivity
analysis

-  Testing validates simulation


[Slide 22] So the summary of our study is that we think that it is really important to avoid
those operational test surprises both for reasons of cost and credibility. The way we think
you do that is by doing more operational testing during development so that we learn about
the problem areas while they are still fixable, rather than wait until the very end. The way
we see setting up a process for doing that is laying out evaluation frameworks which try to
predict how we will do evaluation, but then as the real program progresses, upgrading
them so that they are consistent with the state of knowledge. Have the operational testers
involved so that we do not run into any kind of change that is unanticipated. The second point
we make is the acquisition process stifles evaluation, and unless we have a more open
attitude to doing evaluation it does not matter what [illegible]; we will not do it. So
something has to be done at the upper management levels to change that. Finally, we do not
think you should set up independent simulation offices to accredit or manage the use or
distribution of simulations. They will not add confidence; they would add confusion. We
think that the way to think about this issue of confidence is that simulation focuses testing
into those areas where we do not have confidence and helps us recognize those places
where sensitivity is sufficiently questionable that it is worth testing. That way we get our
confidence: through testing married with simulation.
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